# How Query Engines Work

![rw-book-cover](https://m.media-amazon.com/images/I/91cVfd+nAjL._SY160.jpg)

## Metadata
- Author: Andy Grove
- Full Title: How Query Engines Work
- Category: #books

## Highlights
- SQL is powerful and widely understood but has limitations in the world of so-called “Big Data,” where data scientists often need to mix in custom code with their queries. ([Location 139](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=139))
- Here is an example that demonstrates how Apache Spark can be used to perform a simple aggregate query against a Parquet data set. The real power of Spark is that this query can be run on a laptop or on a cluster of hundreds of servers with no code changes required. ([Location 141](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=141))
- Query engines provide a set of standard operations and transformations that the end-user can combine in different ways through a simple query language or application programming interface and are tuned for good performance. ([Location 161](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=161))
- Apache Arrow started as a specification for a memory format for columnar data, with implementations in Java and C++. The memory format is efficient for vectorized processing on modern hardware such as CPUs with SIMD (Single Instruction, Multiple Data) support and GPUs. ([Location 176](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=176))
- High-level languages such as Python or Java can make calls into lower-level languages such as Rust or C++ for compute-intensive tasks by passing pointers to the data, rather than making a copy of the data in a different format, which would be very expensive. ([Location 179](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=179))
- Data can be transferred between processes efficiently without much serialization overhead because the memory format is also the network format (… ([Location 180](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=180))
- It should make it easier to build connectors, drivers, and integrations between various open-source and commercial projects in the data science and data analytics space and allow developers to use… ([Location 182](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=182))
- The memory model is described in detail on the Arrow web site, but essentially each column is represented by a single vector holding the raw data, along with separate vectors representing null values and… ([Location 185](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=185))
- As I mentioned earlier, data can be passed between processes by passing a pointer to the data. However, the receiving process needs to know how to interpret this data, so an IPC format is defined for exchanging metadata such as schema… ([Location 188](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=188))
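The “data vector plus a separate validity vector” layout is easy to see through the Arrow Java library, which is usable directly from Kotlin. This is a minimal hedged sketch, not taken from the book; the column name `price` and the values are invented, but the `IntVector` calls are the real Arrow Java API:

```kotlin
import org.apache.arrow.memory.RootAllocator
import org.apache.arrow.vector.IntVector

fun main() {
    RootAllocator().use { allocator ->
        IntVector("price", allocator).use { vector ->
            vector.allocateNew(3)
            vector.set(0, 100)   // raw data vector, slot 0
            vector.setNull(1)    // validity vector marks slot 1 as null
            vector.set(2, 300)
            vector.valueCount = 3
            println(vector)      // prints something like [100, null, 300]
        }
    }
}
```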
- Dremio recently donated Gandiva, which is a Java library that compiles expressions down to… ([Location 194](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=194))
- More recently, Arrow has defined a Flight protocol for efficiently streaming Arrow data over the network. Flight is based on… ([Location 196](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=196))
- Handshake between client and server. Depending on the server, the handshake may be required to determine the token that… ([Location 199](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=199))
- Get a list of available streams given particular criteria. Most flight services will expose one or more streams that are… ([Location 201](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=201))
- For a given FlightDescriptor, get information about how the flight can be consumed. This is a useful interface if the consumer of the interface can already… ([Location 205](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=205))
- For a given FlightDescriptor, get the Schema as described in Schema.fbs::Schema. This is used when a consumer… ([Location 210](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=210))
- Retrieve a single stream associated with a particular descriptor associated with… ([Location 212](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=212))
- Push a stream to the flight service associated with a… ([Location 214](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=214))
- Open a bidirectional data channel for a given descriptor. This allows clients to send and receive arbitrary Arrow data and application-specific metadata in a single logical stream. In contrast to DoGet/DoPut, this is more suited for clients… ([Location 218](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=218))
- Flight services can support an arbitrary number of simple actions in addition to the possible ListFlights, GetFlightInfo, DoGet, DoPut… ([Location 220](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=220))
- A flight service exposes all of the available action types that it has… ([Location 223](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=223))
- There is a proposal to add SQL capabilities to Arrow Flight. At the time of writing (Jan 2021), there is a PR up for a C++ implementation… ([Location 225](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=225))
- The Rust implementation of Arrow contains an in-memory query engine named DataFusion, which was donated to the project in 2019. This project is maturing rapidly and is gaining traction. For example, InfluxData is building the core… ([Location 228](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=228))
- Ballista is a distributed compute platform primarily implemented in Rust, and… ([Location 231](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=231))
- The foundational technologies in Ballista are: ([Location 233](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=233))
  - Apache Arrow for the memory model and type system.
  - Apache Arrow Flight protocol for efficient data transfer between processes.
  - Apache Arrow Flight SQL protocol for use by business intelligence tools and JDBC drivers to connect to a Ballista cluster.
  - Google Protocol Buffers for serializing query plans.
  - Docker for packaging up executors along…
- The first step in building a query engine is to choose a type system to represent the different types of data that the query engine will be processing. ([Location 243](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=243))
- If these batches represent columnar data rather than rows, it is possible to use “vectorized processing” and take advantage of SIMD (Single Instruction Multiple Data) to process multiple values within a column with a single CPU instruction. This concept can be taken even further by leveraging GPUs to process much larger quantities of data in parallel. ([Location 253](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=253))
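To make the vectorized-processing idea concrete, here is a hedged Kotlin sketch (mine, not the book's): evaluating an expression one column at a time means a tight loop over primitive arrays, which a JIT compiler can often compile down to SIMD instructions. Representing a column as a plain `IntArray` is a simplification:

```kotlin
// Column-at-a-time addition: one tight loop over each column's values,
// rather than interpreting an expression tree once per row. The JIT can
// often auto-vectorize a loop like this into SIMD instructions.
fun addColumns(left: IntArray, right: IntArray): IntArray {
    require(left.size == right.size)
    val out = IntArray(left.size)
    for (i in left.indices) {
        out[i] = left[i] + right[i]
    }
    return out
}
```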
- A query engine is of little use without a data source to read from, and we want the ability to support multiple data sources, so it is important to create an interface that the query engine can use to interact with data sources. ([Location 366](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=366))
- During query planning, it is important to understand the schema of the data source so that the query plan can be validated to make sure that referenced columns exist and that data types are compatible with the expressions being used to reference them. ([Location 370](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=370))
- CSV files are text files with one record per line and fields separated by commas, hence the name “Comma Separated Values”. ([Location 389](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=389))
- The JavaScript Object Notation format (JSON) is another popular text-based file format. Unlike CSV files, JSON files are structured and can store complex nested data types. ([Location 391](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=391))
- Parquet was created to provide a compressed, efficient columnar data representation and is a popular file format in the Hadoop ecosystem. ([Location 393](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=393))
- Parquet files contain schema information, and data is stored in batches (referred to as “row groups”) where each batch consists of columns. ([Location 396](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=396))
- The Optimized Row Columnar (ORC) format is similar to Parquet. Data is stored in columnar batches called “stripes”. ([Location 399](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=399))
- A logical plan represents a relation (a set of tuples) with a known schema. Each logical plan can have zero or more logical plans as inputs. It is convenient for a logical plan to expose its child plans so that a visitor pattern can be used to walk through the plan. ([Location 403](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=403))
- It is important to be able to print logical plans in human-readable form to help with debugging. ([Location 412](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=412))
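The book builds these pieces in Kotlin. The sketch below is a hedged reconstruction, not a quotation: a `DataSource` exposing its schema and a scan method, a `LogicalPlan` exposing its schema and children (so a visitor can walk the tree), and a recursive pretty-printer for debugging. The `Field`, `Schema`, and `RecordBatch` stubs are assumed shapes:

```kotlin
// Minimal stand-ins for types the book defines elsewhere (assumed shapes).
data class Field(val name: String, val dataType: String)
data class Schema(val fields: List<Field>)
class RecordBatch // a batch of columnar data; details omitted here

interface DataSource {
    /** Return the schema for the underlying data source. */
    fun schema(): Schema
    /** Scan the data source, reading only the specified columns. */
    fun scan(projection: List<String>): Sequence<RecordBatch>
}

interface LogicalPlan {
    /** The schema of the relation this plan produces. */
    fun schema(): Schema
    /** Child plans, exposed so a visitor can walk the tree. */
    fun children(): List<LogicalPlan>
}

/** Print a logical plan in human-readable form, indenting each child. */
fun format(plan: LogicalPlan, indent: Int = 0): String {
    val b = StringBuilder()
    repeat(indent) { b.append('\t') }
    b.append(plan.toString()).append('\n')
    plan.children().forEach { b.append(format(it, indent + 1)) }
    return b.toString()
}
```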
- It is sometimes desirable to be able to serialize query plans so that they can easily be transferred to another process. ([Location 445](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=445))
- It is good practice to add serialization early on as a precaution against accidentally referencing data structures that cannot be serialized (such as file handles or database connections). ([Location 446](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=446))
- Since publishing the first edition of this book, a new standard named “Substrait” has emerged, with the goal of providing cross-language serialization for relational algebra. ([Location 451](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=451))
- When we are planning queries, we will need to know some basic metadata about the output of an expression. Specifically, we need to have a name for the expression so that other expressions can reference it, and we need to know the data type of the values that the expression will produce when evaluated so that we can validate that the query plan is valid. ([Location 465](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=465))
- The Column expression simply represents a reference to a named column. The metadata for this expression is derived by finding the named column in the input and returning that column’s metadata. ([Location 476](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=476))
- Binary expressions are simply expressions that take two inputs. There are three categories of binary expressions that we will implement: comparison expressions, Boolean expressions, and math expressions. ([Location 534](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=534))
- The base class also provides a concise way to implement the concrete Boolean logic expressions. ([Location 611](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=611))
- Math expressions are another specialization of a binary expression. Math expressions typically operate on values of the same data type and produce a result of the same data type. ([Location 624](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=624))
- Aggregate expressions perform an aggregate function such as MIN, MAX, COUNT, SUM, or AVG on an input expression. ([Location 663](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=663))
- With the logical expressions in place, we can now implement the logical plans for the various transformations that the query engine will support. ([Location 716](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=716))
- The Scan logical plan represents fetching data from a DataSource with an optional projection. Scan is the only logical plan in our query engine that does not have another logical plan as an input. It is a leaf node in the query tree. ([Location 717](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=717))
- The Projection logical plan applies a projection to its input. A projection is a list of expressions to be evaluated against the input data. ([Location 764](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=764))
- The Selection logical plan applies a filter expression to determine which rows should be selected (included) in its output. ([Location 795](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=795))
- The Aggregate logical plan is more complex than Projection, Selection, or Scan and calculates aggregates of underlying data such as minimum, maximum, averages, and sums. ([Location 821](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=821))
- Note that in this implementation, the output of the aggregate plan is organized with grouping columns followed by aggregate expressions. It will often be necessary to wrap the aggregate logical plan in a projection so that columns are returned in the order requested in the original query. ([Location 854](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=854))
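A hedged Kotlin sketch of the expression and plan types described in the highlights above (reconstructed in the book's style; it reuses the `Field`, `Schema`, and `LogicalPlan` stubs from the previous sketch): `toField` reports the name and data type an expression produces, `Column` resolves its field from the input plan's schema, and `Projection` derives its output schema from its expressions:

```kotlin
interface LogicalExpr {
    /** Metadata (name and data type) for the value this expression produces. */
    fun toField(input: LogicalPlan): Field
}

/** Reference to a named column; metadata comes from the input plan's schema. */
class Column(val name: String) : LogicalExpr {
    override fun toField(input: LogicalPlan): Field =
        input.schema().fields.firstOrNull { it.name == name }
            ?: throw IllegalStateException("No column named '$name'")
    override fun toString(): String = "#$name"
}

/** Projection evaluates a list of expressions against its input. */
class Projection(val input: LogicalPlan, val expr: List<LogicalExpr>) : LogicalPlan {
    override fun schema(): Schema = Schema(expr.map { it.toField(input) })
    override fun children(): List<LogicalPlan> = listOf(input)
    override fun toString(): String = "Projection: ${expr.joinToString(", ")}"
}
```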
- A DataFrame is just an abstraction around a logical query plan and has methods to perform transformations and actions. It is similar to a fluent-style builder API. ([Location 925](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=925))
- Before we can apply a projection or selection, we need a way to create an initial DataFrame that represents an underlying data source. This is usually obtained through an execution context. ([Location 986](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=986))
- The logical plans defined in chapter five specify what to do but not how to do it, and it is good practice to have separate logical and physical plans, although it is possible to combine them to reduce complexity. ([Location 1083](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1083))
- One reason to keep logical and physical plans separate is that sometimes there can be multiple ways to execute a particular operation, meaning that there is a one-to-many relationship between logical plans and physical plans. For example, there could be separate physical plans for single-process versus distributed execution, or CPU versus GPU execution. Also, operations such as Aggregate and Join can be implemented with a variety of algorithms with different performance trade-offs. ([Location 1085](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1085))
- There could be multiple physical expression implementations for each logical expression. For example, for the logical expression AddExpr that adds two numbers, we could have one implementation that uses the CPU and one that uses the GPU. The query planner could choose which one to use based on the hardware capabilities of the server that the code is running on. ([Location 1105](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1105))
- The logical expression for Column references inputs by name, which is user-friendly for writing queries, but for the physical expression we want to avoid the cost of name lookups every time the expression is evaluated, so it references columns by index instead. ([Location 1115](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1115))
- The physical implementation of a literal expression is simply a literal value wrapped in a class that implements the appropriate trait and provides the same value for every index in a column. ([Location 1133](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1133))
- Aggregate expressions are more complex because they aggregate values across multiple batches of data and then produce one final value, so we need to introduce the concept of accumulators. The physical representation of each aggregate expression needs to know how to produce an appropriate accumulator for the query engine to pass input data to. ([Location 1366](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1366))
- The Scan execution plan simply delegates to a data source, passing in a projection to limit the number of columns to load into memory. No additional logic is performed. ([Location 1466](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1466))
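A hedged Kotlin sketch of the physical side (again a reconstruction, reusing the `Schema` stub from earlier): physical expressions evaluate against a batch of data, the physical column reference uses an index rather than a name, and aggregate expressions produce accumulators. The `ColumnVector` and `RecordBatch` shapes here are assumptions:

```kotlin
// Assumed column abstraction: a vector of values with access by row index.
interface ColumnVector {
    fun getValue(i: Int): Any?
    fun size(): Int
}

class RecordBatch(val schema: Schema, val columns: List<ColumnVector>) {
    fun field(i: Int): ColumnVector = columns[i]
}

/** Physical expressions evaluate against a batch and produce a column of values. */
interface Expression {
    fun evaluate(input: RecordBatch): ColumnVector
}

/** Physical column reference: by index, so there is no name lookup per evaluation. */
class ColumnExpression(val i: Int) : Expression {
    override fun evaluate(input: RecordBatch): ColumnVector = input.field(i)
}

/** Accumulators fold values from many batches into one final value. */
interface Accumulator {
    fun accumulate(value: Any?)
    fun finalValue(): Any?
}

class MaxAccumulator : Accumulator {
    private var max: Comparable<Any>? = null
    @Suppress("UNCHECKED_CAST")
    override fun accumulate(value: Any?) {
        if (value != null) {
            val v = value as Comparable<Any>
            val current = max
            if (current == null || v > current) max = v
        }
    }
    override fun finalValue(): Any? = max
}
```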
- The Projection execution plan simply evaluates the projection expressions against the input columns and then produces a record batch containing the derived columns. ([Location 1498](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1498))
- The Selection execution plan is the first non-trivial plan, since it has conditional logic to determine which rows from the input record batch should be included in the output batches. ([Location 1536](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1536))
- The HashAggregate plan is more complex than the previous plans because it must process all incoming batches, maintain a HashMap of accumulators, and update the accumulators for each row being processed. ([Location 1613](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1613))
- Correlated subqueries are translated into joins before execution (this is explained in chapter ([Location 1767](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1767))
- Uncorrelated queries can be executed individually, and the resulting value can be substituted into the top-level query. ([Location 1768](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1768))
- We have defined logical and physical query plans, and now we need a query planner that can translate the logical plan into the physical plan. ([Location 1779](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1779))
- The query planner may choose different physical plans based on configuration options or based on the target platform’s hardware capabilities. ([Location 1780](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1780))
- The logical Column expression references columns by name, but the physical expression uses column indices for improved performance, so the query planner needs to perform the translation from column name to column index and throw an exception if the column name is not valid. ([Location 1807](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1807))
- The physical expressions for literal values are straightforward, and the mapping from logical to physical expression is trivial because we simply need to copy the literal value over. ([Location 1825](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1825))
- To create a physical expression for a binary expression, we first need to create the physical expressions for the left and right inputs, and then we can create the specific physical expression. ([Location 1835](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1835))
- We need to implement a recursive function to walk the logical plan tree and translate it into a physical plan, using the same pattern described earlier for translating expressions. ([Location 1881](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1881))
- Translating the Scan plan simply requires copying the data source reference and the logical plan’s projection. ([Location 1892](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1892))
- There are two steps to translating a projection. First, we need to create a physical plan for the projection’s input, and then we need to convert the projection’s logical expressions to physical expressions. ([Location 1897](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1897))
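A hedged Kotlin sketch of that expression translation, building on the earlier stubs; `LiteralLong`, `Eq`, and their physical counterparts are hypothetical examples in the pattern described above, not the book's exact classes:

```kotlin
// Hypothetical logical/physical expression pairs (illustrative only).
class LiteralLong(val n: Long) : LogicalExpr {
    override fun toField(input: LogicalPlan) = Field(n.toString(), "Int64")
}
class Eq(val l: LogicalExpr, val r: LogicalExpr) : LogicalExpr {
    override fun toField(input: LogicalPlan) = Field("eq", "Boolean")
}
class LiteralLongExpression(val value: Long) : Expression {
    override fun evaluate(input: RecordBatch): ColumnVector = TODO()
}
class EqExpression(val l: Expression, val r: Expression) : Expression {
    override fun evaluate(input: RecordBatch): ColumnVector = TODO()
}

fun createPhysicalExpr(expr: LogicalExpr, input: LogicalPlan): Expression = when (expr) {
    // Literals map trivially: just copy the value over.
    is LiteralLong -> LiteralLongExpression(expr.n)
    // Columns are resolved from a name to an index, with validation.
    is Column -> {
        val i = input.schema().fields.indexOfFirst { it.name == expr.name }
        if (i == -1) throw IllegalStateException("No column named '${expr.name}'")
        ColumnExpression(i)
    }
    // Binary expressions: translate left and right, then build the specific expression.
    is Eq -> EqExpression(createPhysicalExpr(expr.l, input), createPhysicalExpr(expr.r, input))
    else -> throw UnsupportedOperationException("Unsupported expression: $expr")
}
```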
- The query planning step for Selection is very similar to Projection. ([Location 1915](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1915))
- The query planning step for aggregate queries involves evaluating the expressions that define the optional grouping keys, evaluating the expressions that are the inputs to the aggregate functions, and then creating the physical aggregate expressions. ([Location 1927](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1927))
- Rule-based optimizations are a simple and pragmatic approach to applying common-sense optimizations to a query plan. These optimizations are typically executed against the logical plan before the physical plan is created, although rule-based optimizations can also be applied to physical plans. ([Location 1970](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1970))
- The goal of the projection push-down rule is to filter out columns as soon as possible after reading data from disk and before other phases of query execution, to reduce the amount of data that is kept in memory (and potentially transferred over the network in the case of distributed queries) between operators. ([Location 1982](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=1982))
- The Predicate Push-Down optimization aims to filter out rows as early as possible within a query, to avoid redundant processing. ([Location 2110](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=2110))
- Given a query such as SELECT sum(price * qty) as total_price, sum(price * qty * tax_rate) as total_tax FROM ..., we can see that the expression price * qty appears twice. Rather than perform this computation twice, we could choose to rewrite the plan to compute it once. ([Location 2151](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=2151))
- Given a query such as SELECT id FROM foo WHERE EXISTS (SELECT * FROM bar WHERE foo.id = bar.id), a simple implementation would be to scan all rows in foo and then perform a lookup in bar for each row in foo. This would be extremely inefficient, so query engines typically translate correlated subqueries into joins. This is also known as subquery decorrelation. ([Location 2172](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=2172))
- Cost-based optimization refers to optimization rules that use statistics about the underlying data to determine a cost of executing a particular query, and then choose an optimal execution plan by looking for one with a low cost. Good examples would be choosing which join algorithm to use, or choosing which order tables should be joined in, based on the sizes of the underlying tables. ([Location 2204](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=2204))
- One major drawback to cost-based optimizations is that they depend on the availability of accurate and detailed statistics about the underlying data. ([Location 2206](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=2206))
- Here is a bare-bones implementation of a Pratt parser. In my opinion, it is beautiful in its simplicity. Expression parsing is performed by a simple loop that parses a “prefix” expression followed by an optional “infix” expression, and keeps doing this until the precedence changes in such a way that the parser recognizes that it has finished parsing the expression. ([Location 2438](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=2438))
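The loop the author describes is small enough to sketch here. This is a hedged reconstruction in the book's Kotlin style, not a verbatim quote, and `SqlExpr` is a stub:

```kotlin
// Stub for the expression type the parser produces.
interface SqlExpr

interface PrattParser {
    /** Parse a prefix expression, then keep folding in infix expressions while
     *  the next token binds more tightly than the current precedence. */
    fun parse(precedence: Int = 0): SqlExpr? {
        var expr = parsePrefix() ?: return null
        while (precedence < nextPrecedence()) {
            expr = parseInfix(expr, nextPrecedence())
        }
        return expr
    }
    /** Precedence of the next token, or 0 at end of input. */
    fun nextPrecedence(): Int
    /** Parse a prefix expression (literal, identifier, parenthesized expression, ...). */
    fun parsePrefix(): SqlExpr?
    /** Combine the already-parsed left expression with the next infix operator. */
    fun parseInfix(left: SqlExpr, precedence: Int): SqlExpr
}
```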
- Implementing the nextPrecedence method is simple because we only have a small number of tokens that have any precedence here, and we need the multiplication and division operators to have higher precedence than the addition and subtraction operators. ([Location 2527](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=2527))
- concrete syntax tree (CST). Note that with other approaches to parsing, such as using a parser generator like ANTLR, there is an intermediate stage known as an Abstract Syntax Tree (AST) which then needs to be translated to a Concrete Syntax Tree, but with the Pratt parser approach we go directly from tokens to the CST. ([Location 2610](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=2610))
- A simpler and more generic approach would be to add all the required expressions to the projection so that the selection can be applied after the projection, and then remove any columns that were added by wrapping the output in another projection. ([Location 2645](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=2645))
- The simplest form of distributed query execution is parallel query execution utilizing multiple CPU cores on a single node using threads. ([Location 2853](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=2853))
- Although the strategy of using one thread per file worked well in this example, it does not work as a general-purpose approach to partitioning. If a data source has thousands of small partitions, starting one thread per partition would be inefficient. A better approach would be for the query planner to decide how to share the available data between a specified number of worker threads (or executors). ([Location 2992](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=2992))
- One solution to this problem is to place files in directories and use directory names consisting of key-value pairs to specify the contents. ([Location 3003](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3003))
- When performing an inner join with a single thread, a simple approach is to load one side of the join into memory and then scan the other side, performing lookups against the data stored in memory. This classic Hash Join algorithm is efficient if one side of the join can fit into memory. ([Location 3010](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3010))
- The parallel version of this is known as a Partitioned Hash Join or Parallel Hash Join. It involves partitioning both inputs based on the join keys and performing a classic Hash Join on each pair of input partitions. ([Location 3013](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3013))
- Certain operators can run in parallel on partitions of data without any significant overhead when running in a distributed environment. The best examples of these are Projection and Filter. These operators can be applied in parallel to each input partition of the data being operated on and produce a corresponding output partition for each one. ([Location 3021](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3021))
- Joins are often the most expensive operation to perform in a distributed environment. The reason for this is that we need to make sure that we organize the data in such a way that both input relations are partitioned on the join keys. ([Location 3052](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3052))
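Here is a hedged Kotlin sketch of the partitioned hash join idea (illustrative only; `partitionByKey`, the key extractors, and the partition count are invented): hash each row's join key to pick a partition, so matching keys from both inputs always land in the same partition pair, then run a classic hash join per pair:

```kotlin
// Route each row to a partition based on the hash of its join key, so that
// matching keys from both inputs land in the same partition pair.
fun <T> partitionByKey(rows: List<T>, key: (T) -> Any, numPartitions: Int): List<List<T>> {
    val partitions = List(numPartitions) { mutableListOf<T>() }
    for (row in rows) {
        val p = Math.floorMod(key(row).hashCode(), numPartitions)
        partitions[p].add(row)
    }
    return partitions
}

// Classic hash join within one partition pair: build a hash table on one side,
// then probe it with the other side.
fun <L, R> hashJoin(build: List<L>, probe: List<R>, lk: (L) -> Any, rk: (R) -> Any): List<Pair<L, R>> {
    val table = build.groupBy(lk)
    return probe.flatMap { r -> (table[rk(r)] ?: emptyList()).map { l -> l to r } }
}
```

Each pair of partitions is independent of the others, so the per-pair joins can run on separate threads or separate executors.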
- Distributed query plans are fundamentally different to in-process query plans because we can’t just build a tree of operators and start executing them. The query now requires co-ordination across executors, which means that we now need to build a scheduler. ([Location 3060](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3060))
- At a high level, the concept of a distributed query scheduler is not complex. The scheduler needs to examine the whole query and break it down into stages that can be executed in isolation (usually in parallel across the executors) and then schedule these stages for execution based on the available resources in the cluster. Once each query stage completes, any subsequent dependent query stages can be scheduled. ([Location 3062](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3062))
- However, there are advantages in using a serialization format that is programming language-agnostic. Ballista uses Google’s Protocol Buffers format to define query plans. The project is typically abbreviated as “protobuf”. ([Location 3154](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3154))
- Apache Arrow provides an IPC (Inter-Process Communication) format for exchanging data between processes. Because of the standardized memory layout provided by Arrow, the raw bytes can be transferred directly between memory and an input/output device (disk, network, etc.) without the overhead typically associated with serialization. ([Location 3195](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3195))
- Apache Arrow provides a Flight protocol which is intended for this exact purpose. Flight is a new general-purpose client-server framework to simplify high-performance transport of large datasets over network interfaces. ([Location 3205](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3205))
- The Arrow Flight libraries provide a development framework for implementing a service that can send and receive data streams. A Flight server supports several basic kinds of requests: ([Location 3208](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3208))
  - Handshake: a simple request to determine whether the client is authorized and, in some cases, to establish an implementation-defined session token to use for future requests
  - ListFlights: return a list of available data streams
  - GetSchema: return the schema for a data stream
  - GetFlightInfo: return an “access plan” for a dataset of interest, possibly requiring consuming multiple data streams. This request can accept custom serialized commands containing, for example, your specific application parameters.
  - DoGet: send a data stream to a client
  - DoPut: receive a data stream from a client
  - DoAction: perform an implementation-specific action and return any results, i.e. a generalized function call
  - ListActions: return a list of available action types
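As a concrete illustration of DoGet, here is a hedged Kotlin sketch using the Arrow Flight Java library. The endpoint address and the ticket payload are invented placeholders; a real client would typically obtain tickets from a GetFlightInfo call first:

```kotlin
import org.apache.arrow.flight.FlightClient
import org.apache.arrow.flight.Location
import org.apache.arrow.flight.Ticket
import org.apache.arrow.memory.RootAllocator

fun main() {
    RootAllocator().use { allocator ->
        // Hypothetical endpoint for a local Flight service.
        val location = Location.forGrpcInsecure("localhost", 50051)
        FlightClient.builder(allocator, location).build().use { client ->
            // The ticket payload is implementation-defined; a query string is
            // just a placeholder here.
            val stream = client.getStream(Ticket("SELECT ...".toByteArray()))
            while (stream.next()) {
                // Each call to next() loads the next Arrow record batch into the root.
                println("received batch with ${stream.root.rowCount} rows")
            }
        }
    }
}
```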
- Distributed query execution has a lot of overhead compared to parallel query execution on a single host and should only be used when there is benefit in doing so. ([Location 3230](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3230))
- Much of the complexity of query engines comes from the fact that operators and expressions can be combined in infinite combinations due to the nested nature of operator and expression trees, and it is unlikely that hand-coding test queries will be comprehensive enough. ([Location 3356](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3356))
- When measuring scalability based on the number of concurrent requests, we are often more interested in throughput (total number of queries executed per period of time) rather than the duration of individual queries, although we typically would collect that information as well. ([Location 3505](https://readwise.io/to_kindle?action=open&asin=B09GP1C6CQ&location=3505))
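The point about hand-coded tests suggests generating queries instead. A hedged sketch of one such approach, entirely illustrative (the column names, operators, and query template are invented): recursively build random expression trees and feed the resulting queries to the engine, ideally comparing results against a reference implementation:

```kotlin
import kotlin.random.Random

// Generate a random arithmetic expression over a fixed set of columns.
// Deeper trees exercise more operator/expression combinations.
fun randomExpr(depth: Int, rnd: Random): String =
    if (depth == 0) {
        listOf("c0", "c1", "c2", rnd.nextInt(100).toString()).random(rnd)
    } else {
        val op = listOf("+", "-", "*").random(rnd)
        "(${randomExpr(depth - 1, rnd)} $op ${randomExpr(depth - 1, rnd)})"
    }

fun main() {
    val rnd = Random(42) // fixed seed so failures are reproducible
    repeat(5) { println("SELECT ${randomExpr(3, rnd)} FROM test_table") }
}
```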