Package: arrow 18.1.0

Jonathan Keane

arrow: Integration to 'Apache' 'Arrow'

'Apache' 'Arrow' <https://arrow.apache.org/> is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. This package provides an interface to the 'Arrow C++' library.

Authors: Neal Richardson [aut], Ian Cook [aut], Nic Crane [aut], Dewey Dunnington [aut], Romain François [aut], Jonathan Keane [aut, cre], Dragoș Moldovan-Grünfeld [aut], Jeroen Ooms [aut], Jacob Wujciak-Jens [aut], Javier Luraschi [ctb], Karl Dunkle Werner [ctb], Jeffrey Wong [ctb], Apache Arrow [aut, cph]

arrow_18.1.0.tar.gz
arrow_18.1.0.tar.gz (r-4.5-noble) | arrow_18.1.0.tar.gz (r-4.4-noble)
arrow.pdf | arrow.html
arrow/json (API)
NEWS

# Install 'arrow' in R:
install.packages('arrow', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/apache/arrow/issues

Pkgdown site: https://arrow.apache.org

Uses libs:
  • curl – Easy-to-use client-side URL transfer library
  • openssl – Secure Sockets Layer toolkit
  • c++ – GNU Standard C++ Library v3

11.96 score 3 stars 69 packages 11k scripts 381k downloads 6 mentions 240 exports 14 dependencies

Last updated 26 days ago from: 560342ae0f. Checks: OK: 2. Indexed: no.

Target | Result | Date
Doc / Vignettes | OK | Dec 05 2024
R-4.5-linux-x86_64 | OK | Dec 05 2024

Exports: all_of Array arrow_array arrow_available arrow_info arrow_table arrow_with_acero arrow_with_dataset arrow_with_gcs arrow_with_json arrow_with_parquet arrow_with_s3 arrow_with_substrait as_arrow_array as_arrow_table as_chunked_array as_data_type as_record_batch as_record_batch_reader as_schema binary bool boolean buffer Buffer BufferOutputStream BufferReader call_function cast_options chunked_array ChunkedArray Codec codec_is_available CompressedInputStream CompressedOutputStream CompressionType concat_arrays concat_tables contains copy_files cpu_count create_package_with_all_dependencies csv_convert_options csv_parse_options csv_read_options csv_write_options CsvConvertOptions CsvFileFormat CsvFragmentScanOptions CsvParseOptions CsvReadOptions CsvTableReader CsvWriteOptions Dataset dataset_factory DatasetFactory date32 date64 DateUnit decimal decimal128 decimal256 default_memory_pool dictionary DictionaryArray DirectoryPartitioning DirectoryPartitioningFactory duration ends_with everything Expression ExtensionArray ExtensionType FeatherReader field Field FileFormat FileInfo FileMode FileOutputStream FileSelector FileSystem FileSystemDataset FileSystemDatasetFactory FileType fixed_size_binary fixed_size_list_of FixedSizeListArray FixedSizeListType flight_connect flight_disconnect flight_get flight_path_exists flight_put float float16 float32 float64 FragmentScanOptions GcsFileSystem gs_bucket halffloat hive_partition HivePartitioning HivePartitioningFactory infer_schema infer_type InMemoryDataset install_arrow install_pyarrow int16 int32 int64 int8 io_thread_count IpcFileFormat is_in JoinType JsonFileFormat JsonFragmentScanOptions JsonParseOptions JsonReadOptions JsonTableReader large_binary large_list_of large_utf8 LargeListArray last_col list_compute_functions list_flights list_of ListArray load_flight_server LocalFileSystem map_batches map_of MapArray MapType match_arrow matches MemoryMappedFile MessageReader MessageType MetadataVersion mmap_create mmap_open new_extension_array new_extension_type null NullEncodingBehavior NullHandlingBehavior num_range one_of open_csv_dataset open_dataset open_delim_dataset open_tsv_dataset ParquetArrowReaderProperties ParquetFileFormat ParquetFileReader ParquetFileWriter ParquetFragmentScanOptions ParquetReaderProperties ParquetVersionType ParquetWriterProperties Partitioning QuantileInterpolation RandomAccessFile read_csv_arrow read_csv2_arrow read_delim_arrow read_feather read_ipc_file read_ipc_stream read_json_arrow read_message read_parquet read_schema read_tsv_arrow ReadableFile record_batch RecordBatch RecordBatchFileReader RecordBatchFileWriter RecordBatchReader RecordBatchStreamReader RecordBatchStreamWriter register_extension_type register_scalar_function reregister_extension_type RoundMode s3_bucket S3FileSystem scalar Scalar Scanner ScannerBuilder schema Schema set_cpu_count set_io_thread_count show_exec_plan starts_with StatusCode string struct StructArray StructScalar SubTreeFileSystem Table time32 time64 timestamp TimestampParser TimeUnit to_arrow to_duckdb type Type uint16 uint32 uint64 uint8 unify_schemas UnionDataset unregister_extension_type utf8 value_counts vctrs_extension_array vctrs_extension_type write_csv_arrow write_csv_dataset write_dataset write_delim_dataset write_feather write_ipc_file write_ipc_stream write_parquet write_to_raw write_tsv_dataset

Dependencies: assertthat bit bit64 cli cpp11 glue lifecycle magrittr purrr R6 rlang tidyselect vctrs withr

Readme and manuals

Help Manual

Help page | Topics
Functions available in Arrow dplyr queries | acero arrow-dplyr arrow-functions arrow-verbs
Array Classes | Array DictionaryArray FixedSizeListArray LargeListArray ListArray MapArray StructArray
ArrayData class | ArrayData
Create an Arrow Array | arrow_array
Report information on the package's capabilities | arrow_available arrow_info arrow_with_acero arrow_with_dataset arrow_with_gcs arrow_with_json arrow_with_parquet arrow_with_s3 arrow_with_substrait
Create an Arrow Table | arrow_table
Convert an object to an Arrow Array | as_arrow_array as_arrow_array.Array as_arrow_array.ChunkedArray as_arrow_array.Scalar
Convert an object to an Arrow Table | as_arrow_table as_arrow_table.arrow_dplyr_query as_arrow_table.data.frame as_arrow_table.Dataset as_arrow_table.default as_arrow_table.RecordBatch as_arrow_table.RecordBatchReader as_arrow_table.Schema as_arrow_table.Table
Convert an object to an Arrow ChunkedArray | as_chunked_array as_chunked_array.Array as_chunked_array.ChunkedArray
Convert an object to an Arrow DataType | as_data_type as_data_type.DataType as_data_type.Field as_data_type.Schema
Convert an object to an Arrow RecordBatch | as_record_batch as_record_batch.arrow_dplyr_query as_record_batch.data.frame as_record_batch.RecordBatch as_record_batch.Table
Convert an object to an Arrow RecordBatchReader | as_record_batch_reader as_record_batch_reader.arrow_dplyr_query as_record_batch_reader.data.frame as_record_batch_reader.Dataset as_record_batch_reader.function as_record_batch_reader.RecordBatch as_record_batch_reader.RecordBatchReader as_record_batch_reader.Scanner as_record_batch_reader.Table
Convert an object to an Arrow Schema | as_schema as_schema.Schema as_schema.StructType
Create a Buffer | buffer
Buffer class | Buffer
Call an Arrow compute function | call_function
Create a Chunked Array | chunked_array
ChunkedArray class | ChunkedArray
Compression Codec class | Codec
Check whether a compression codec is available | codec_is_available
Compressed stream classes | CompressedInputStream CompressedOutputStream compression
Concatenate zero or more Arrays | c.Array concat_arrays
Concatenate one or more Tables | concat_tables
Copy files between FileSystems | copy_files
Manage the global CPU thread pool in libarrow | cpu_count set_cpu_count
Create a source bundle that includes all thirdparty dependencies | create_package_with_all_dependencies
CSV Convert Options | csv_convert_options
CSV Parsing Options | csv_parse_options
CSV Reading Options | csv_read_options
CSV Writing Options | csv_write_options
CSV dataset file format | CsvFileFormat
File reader options | CsvConvertOptions CsvParseOptions CsvReadOptions CsvWriteOptions JsonParseOptions JsonReadOptions TimestampParser
Arrow CSV and JSON table reader classes | CsvTableReader JsonTableReader
Create Arrow data types | binary bool boolean data-type date32 date64 decimal decimal128 decimal256 duration FixedSizeListType fixed_size_binary fixed_size_list_of float float16 float32 float64 halffloat int16 int32 int64 int8 large_binary large_list_of large_utf8 list_of MapType map_of null string struct time32 time64 timestamp uint16 uint32 uint64 uint8 utf8
Multi-file datasets | Dataset DatasetFactory FileSystemDataset FileSystemDatasetFactory InMemoryDataset UnionDataset
Create a DatasetFactory | dataset_factory
DataType class | DataType
Create a dictionary type | dictionary
class DictionaryType | DictionaryType
Arrow expressions | Expression
ExtensionArray class | ExtensionArray
ExtensionType class | ExtensionType
FeatherReader class | FeatherReader
Create a Field | field
Field class | Field
Dataset file formats | FileFormat IpcFileFormat ParquetFileFormat
FileSystem entry info | FileInfo
file selector | FileSelector
FileSystem classes | FileSystem GcsFileSystem LocalFileSystem S3FileSystem SubTreeFileSystem
Format-specific write options | FileWriteOptions
FixedWidthType class | FixedWidthType
Connect to a Flight server | flight_connect
Explicitly close a Flight client | flight_disconnect
Get data from a Flight server | flight_get
Send data to a Flight server | flight_put
Format-specific scan options | CsvFragmentScanOptions FragmentScanOptions JsonFragmentScanOptions ParquetFragmentScanOptions
Connect to a Google Cloud Storage (GCS) bucket | gs_bucket
Construct Hive partitioning | hive_partition
Extract a schema from an object | infer_schema
Infer the arrow Array type from an R object | infer_type type
InputStream classes | BufferReader InputStream MemoryMappedFile RandomAccessFile ReadableFile
Install or upgrade the Arrow library | install_arrow
Install pyarrow for use with reticulate | install_pyarrow
Manage the global I/O thread pool in libarrow | io_thread_count set_io_thread_count
JSON dataset file format | JsonFileFormat
List available Arrow C++ compute functions | list_compute_functions
See available resources on a Flight server | flight_path_exists list_flights
Load a Python Flight server | load_flight_server
Apply a function to a stream of RecordBatches | map_batches
Value matching for Arrow objects | is_in match_arrow
Message class | Message
MessageReader class | MessageReader
Create a new read/write memory mapped file of a given size | mmap_create
Open a memory mapped file | mmap_open
Extension types | new_extension_array new_extension_type register_extension_type reregister_extension_type unregister_extension_type
Open a multi-file dataset | open_dataset
Open a multi-file dataset of CSV or other delimiter-separated format | open_csv_dataset open_delim_dataset open_tsv_dataset
OutputStream classes | BufferOutputStream FileOutputStream OutputStream
ParquetArrowReaderProperties class | ParquetArrowReaderProperties
ParquetFileReader class | ParquetFileReader
ParquetFileWriter class | ParquetFileWriter
ParquetReaderProperties class | ParquetReaderProperties
ParquetWriterProperties class | ParquetWriterProperties
Define Partitioning for a Dataset | DirectoryPartitioning DirectoryPartitioningFactory HivePartitioning HivePartitioningFactory Partitioning
Read a CSV or other delimited file with Arrow | read_csv2_arrow read_csv_arrow read_delim_arrow read_tsv_arrow
Read a Feather file (an Arrow IPC file) | read_feather read_ipc_file
Read Arrow IPC stream format | read_ipc_stream
Read a JSON file | read_json_arrow
Read a Message from a stream | read_message
Read a Parquet file | read_parquet
Read a Schema from a stream | read_schema
Create a RecordBatch | record_batch
RecordBatch class | RecordBatch
RecordBatchReader classes | RecordBatchFileReader RecordBatchReader RecordBatchStreamReader
RecordBatchWriter classes | RecordBatchFileWriter RecordBatchStreamWriter RecordBatchWriter
Register user-defined functions | register_scalar_function
Connect to an AWS S3 bucket | s3_bucket
Create an Arrow Scalar | scalar StructScalar
Arrow scalars | Scalar
Scan the contents of a dataset | Scanner ScannerBuilder
Create a schema or extract one from an object. | schema
Schema class | Schema
Show the details of an Arrow Execution Plan | show_exec_plan
Table class | Table
Create an Arrow object from a DuckDB connection | to_arrow
Create a (virtual) DuckDB table from an Arrow object | to_duckdb
Combine and harmonize schemas | unify_schemas
'table' for Arrow objects | value_counts
Extension type for generic typed vectors | vctrs_extension_array vctrs_extension_type
Write CSV file to disk | write_csv_arrow
Write a dataset | write_dataset
Write a dataset into partitioned flat files. | write_csv_dataset write_delim_dataset write_tsv_dataset
Write a Feather file (an Arrow IPC file) | write_feather write_ipc_file
Write Arrow IPC stream format | write_ipc_stream
Write Parquet file to disk | write_parquet
Write Arrow data to a raw vector | write_to_raw