Changelog History

  • v1.8.0 Changes

    July 12, 2015

    Bug

    • PARQUET-151 - Null Pointer exception in parquet.hadoop.ParquetFileWriter.mergeFooters
    • PARQUET-152 - Encoding issue with fixed length byte arrays
    • PARQUET-164 - Warn when parquet memory manager kicks in
    • PARQUET-199 - Add a callback when the MemoryManager adjusts row group size
    • PARQUET-201 - Column with OriginalType INT_8 failed at filtering
    • PARQUET-227 - Parquet thrift can write unions that have 0 or more than 1 set value
    • PARQUET-246 - ArrayIndexOutOfBoundsException with Parquet write version v2
    • PARQUET-251 - Binary column statistics error when reuse byte[] among rows
    • PARQUET-252 - parquet scrooge support should support nested container type
    • PARQUET-254 - Wrong exception message for unsupported INT96 type
    • PARQUET-269 - Restore scrooge-maven-plugin to 3.17.0 or greater
    • PARQUET-284 - Should use ConcurrentHashMap instead of HashMap in ParquetMetadataConverter
    • PARQUET-285 - Implement nested types write rules in parquet-avro
    • PARQUET-287 - Projecting unions in thrift causes TExceptions in deserialization
    • PARQUET-296 - Set master branch version back to 1.8.0-SNAPSHOT
    • PARQUET-297 - created_by in file meta data doesn't contain parquet library version
    • PARQUET-314 - Fix broken equals implementation(s)
    • PARQUET-316 - Run.sh is broken in parquet-benchmarks
    • PARQUET-317 - writeMetaDataFile crashes when a relative root Path is used
    • PARQUET-320 - Restore semver checks
    • PARQUET-324 - row count incorrect if data file has more than 2^31 rows
    • PARQUET-325 - Do not target row group sizes if padding is set to 0
    • PARQUET-329 - ThriftReadSupport#THRIFT_COLUMN_FILTER_KEY was removed (incompatible change)

    Improvement

    • PARQUET-175 - Allow setting of a custom protobuf class when reading parquet file using parquet-protobuf.
    • PARQUET-223 - Add Map and List builders
    • PARQUET-245 - Travis CI runs tests even if build fails
    • PARQUET-248 - Simplify ParquetWriter's constructors
    • PARQUET-253 - AvroSchemaConverter has confusing Javadoc
    • PARQUET-259 - Support Travis CI in parquet-cpp
    • PARQUET-264 - Update README docs for graduation
    • PARQUET-266 - Add support for lists of primitives to Pig schema converter
    • PARQUET-272 - Updates docs description to match data model
    • PARQUET-274 - Updates URLs to link against the apache user instead of Parquet on github
    • PARQUET-276 - Updates CONTRIBUTING file with new repo info
    • PARQUET-286 - Avro object model should use Utf8
    • PARQUET-288 - Add dictionary support to Avro converters
    • PARQUET-289 - Allow object models to extend the ParquetReader builders
    • PARQUET-290 - Add Avro data model to the reader builder
    • PARQUET-306 - Improve alignment between row groups and HDFS blocks
    • PARQUET-308 - Add accessor to ParquetWriter to get current data size
    • PARQUET-309 - Remove unnecessary compile dependency on parquet-generator
    • PARQUET-321 - Set the HDFS padding default to 8MB
    • PARQUET-327 - Show statistics in the dump output

    New Feature

    • PARQUET-229 - Make an alternate, stricter thrift column projection API
    • PARQUET-243 - Add avro-reflect support

    Task

    • PARQUET-262 - When 1.7.0 is released, restore semver plugin config
    • PARQUET-292 - Release Parquet 1.8.0
  • v1.7.0 Changes

  • v1.6.0 Changes

    Bug

    • PARQUET-3 - tool to merge pull requests based on Spark
    • PARQUET-4 - Use LRU caching for footers in ParquetInputFormat.
    • PARQUET-8 - [parquet-scrooge] mvn eclipse:eclipse fails on parquet-scrooge
    • PARQUET-9 - InternalParquetRecordReader will not read multiple blocks when filtering
    • PARQUET-18 - Cannot read dictionary-encoded pages with all null values
    • PARQUET-19 - NPE when an empty file is included in a Hive query that uses CombineHiveInputFormat
    • PARQUET-21 - Fix reference to 'github-apache' in dev docs
    • PARQUET-56 - Added an accessor for the Long column type in example Group
    • PARQUET-62 - DictionaryValuesWriter dictionaries are corrupted by user changes.
    • PARQUET-63 - Fixed-length columns cannot be dictionary encoded.
    • PARQUET-66 - InternalParquetRecordWriter int overflow causes unnecessary memory check warning
    • PARQUET-69 - Add committer doc and REVIEWERS files
    • PARQUET-70 - PARQUET #36: Pig Schema Storage to UDFContext
    • PARQUET-75 - String decode using 'new String' is slow
    • PARQUET-80 - upgrade semver plugin version to 0.9.27
    • PARQUET-82 - ColumnChunkPageWriteStore assumes pages are smaller than Integer.MAX_VALUE
    • PARQUET-88 - Fix pre-version enforcement.
    • PARQUET-94 - ParquetScroogeScheme constructor ignores klass argument
    • PARQUET-96 - parquet.example.data.Group is missing some methods
    • PARQUET-97 - ProtoParquetReader builder factory method not static
    • PARQUET-101 - Exception when reading data with parquet.task.side.metadata=false
    • PARQUET-104 - Parquet writes empty Rowgroup at the end of the file
    • PARQUET-106 - Relax InputSplit Protections
    • PARQUET-107 - Add option to disable summary metadata aggregation after MR jobs
    • PARQUET-114 - Sample NanoTime class serializes and deserializes Timestamp incorrectly
    • PARQUET-122 - make parquet.task.side.metadata=true by default
    • PARQUET-124 - parquet.hadoop.ParquetOutputCommitter.commitJob() throws parquet.io.ParquetEncodingException
    • PARQUET-132 - AvroParquetInputFormat should use a parameterized type
    • PARQUET-135 - Input location is not getting set for the getStatistics in ParquetLoader when using two different loaders within a Pig script.
    • PARQUET-136 - NPE thrown in StatisticsFilter when all values in a string/binary column chunk are null
    • PARQUET-142 - parquet-tools doesn't filter _SUCCESS file
    • PARQUET-145 - InternalParquetRecordReader.close() should not throw an exception if initialization has failed
    • PARQUET-150 - Merge script requires ':' in PR names
    • PARQUET-157 - Divide by zero in logging code
    • PARQUET-159 - parquet-hadoop tests fail to compile
    • PARQUET-162 - ParquetThrift should throw when unrecognized columns are passed to the column projection API
    • PARQUET-168 - Wrong command line option description in parquet-tools
    • PARQUET-173 - StatisticsFilter doesn't handle And properly
    • PARQUET-174 - Fix Java6 compatibility
    • PARQUET-176 - Parquet fails to parse schema contains '\r'
    • PARQUET-180 - Parquet-thrift compile issue with 0.9.2.
    • PARQUET-184 - Add release scripts and documentation
    • PARQUET-186 - Poor performance in SnappyCodec because of string concat in tight loop
    • PARQUET-187 - parquet-scrooge doesn't compile under 2.11
    • PARQUET-188 - Parquet writes columns out of order (compared to the schema)
    • PARQUET-189 - Support building parquet with thrift 0.9.0
    • PARQUET-196 - parquet-tools command to get rowcount & size
    • PARQUET-197 - parquet-cascading and the mapred API does not create metadata file
    • PARQUET-202 - Typo in the connection info in the pom prevents publishing an RC
    • PARQUET-207 - ParquetInputSplit end calculation bug
    • PARQUET-208 - revert PARQUET-197
    • PARQUET-214 - Avro: Regression caused by schema handling
    • PARQUET-215 - Parquet Thrift should discard records with unrecognized union members
    • PARQUET-216 - Decrease the default page size to 64k
    • PARQUET-217 - Memory Manager's min allocation heuristic is not valid for schemas with many columns
    • PARQUET-232 - minor compilation issue
    • PARQUET-234 - Restore ParquetInputSplit methods from 1.5.0
    • PARQUET-235 - Fix compatibility of parquet.metadata with 1.5.0
    • PARQUET-236 - Check parquet-scrooge compatibility
    • PARQUET-237 - Check ParquetWriter constructor compatibility with 1.5.0
    • PARQUET-239 - Make AvroParquetReader#builder() static
    • PARQUET-242 - AvroReadSupport.setAvroDataSupplier is broken

    Improvement

    • PARQUET-2 - Adding Type Persuasion for Primitive Types
    • PARQUET-25 - Pushdown predicates only work with hardcoded arguments
    • PARQUET-52 - Improve the encoding fall back mechanism for Parquet 2.0
    • PARQUET-57 - Make dev commit script easier to use
    • PARQUET-61 - Avoid fixing protocol events when there is not required field missing
    • PARQUET-74 - Use thread local decoder cache in Binary toStringUsingUTF8()
    • PARQUET-79 - Add thrift streaming API to read metadata
    • PARQUET-84 - Add an option to read the rowgroup metadata on the task side.
    • PARQUET-87 - Better and unified API for projection pushdown on cascading scheme
    • PARQUET-89 - All Parquet CI tests should be run against hadoop-2
    • PARQUET-92 - Parallel Footer Read Control
    • PARQUET-105 - Refactor and Document Parquet Tools
    • PARQUET-108 - Parquet Memory Management in Java
    • PARQUET-115 - Pass a filter object to user defined predicate in filter2 api
    • PARQUET-116 - Pass a filter object to user defined predicate in filter2 api
    • PARQUET-117 - implement the new page format for Parquet 2.0
    • PARQUET-119 - add data_encodings to ColumnMetaData to enable dictionary based predicate push down
    • PARQUET-121 - Allow Parquet to build with Java 8
    • PARQUET-128 - Optimize the parquet RecordReader implementation when: A. filterpredicate is pushed down, B. filterpredicate is pushed down on a flat schema
    • PARQUET-133 - Upgrade snappy-java to 1.1.1.6
    • PARQUET-134 - Enhance ParquetWriter with file creation flag
    • PARQUET-140 - Allow clients to control the GenericData object that is used to read Avro records
    • PARQUET-141 - improve parquet scrooge integration
    • PARQUET-160 - Simplify CapacityByteArrayOutputStream
    • PARQUET-165 - A benchmark module for Parquet would be nice
    • PARQUET-177 - MemoryManager ensure minimum Column Chunk size
    • PARQUET-181 - Scrooge Write Support
    • PARQUET-191 - Avro schema conversion incorrectly converts maps with nullable values.
    • PARQUET-192 - Avro maps drop null values
    • PARQUET-193 - Avro: Implement read compatibility rules for nested types
    • PARQUET-203 - Consolidate PathFilter for hidden files
    • PARQUET-204 - Directory support for parquet-schema
    • PARQUET-210 - JSON output for parquet-cat

    New Feature

    • PARQUET-22 - Parquet #13: Backport of HIVE-6938
    • PARQUET-49 - Create a new filter API that supports filtering groups of records based on their statistics
    • PARQUET-64 - Add new logical types to parquet-column
    • PARQUET-123 - Add dictionary support to AvroIndexedRecordReader
    • PARQUET-198 - parquet-cascading Add Parquet Avro Scheme

    Task

    • PARQUET-50 - Remove items from semver blacklist
    • PARQUET-139 - Avoid reading file footers in parquet-avro InputFormat
    • PARQUET-190 - Fix an inconsistent Javadoc comment of ReadSupport.prepareForRead
    • PARQUET-230 - Add build instructions to the README
  • v1.5.0 Changes

    • ISSUE 399: Fixed resetting stats after writePage bug, unit testing of readFooter
    • ISSUE 397: Fixed issue with column pruning when using requested schema
    • ISSUE 389: Added padding for requested columns not found in file schema
    • ISSUE 392: Value stats fixes
    • ISSUE 338: Added statistics to Parquet pages and rowGroups
    • ISSUE 351: Fix bug #350, fixed length argument out of order.
    • ISSUE 378: configure semver to enforce semantic versioning
    • ISSUE 355: Add support for DECIMAL type annotation.
    • ISSUE 336: protobuf dependency version changed from 2.4.1 to 2.5.0
    • ISSUE 337: issue #324, move ParquetStringInspector to org.apache.hadoop.hive.serde...
  • v1.4.3 Changes

    • ISSUE 381: fix metadata concurrency problem
  • v1.4.2 Changes

    • ISSUE 359: Expose values in SimpleRecord
    • ISSUE 335: issue #290, hive map conversion to parquet schema
    • ISSUE 365: generate splits by min max size, and align to HDFS block when possible
    • ISSUE 353: Fix bug: optional enum field causing ScroogeSchemaConverter to fail
    • ISSUE 362: Fix output bug during parquet-dump command
    • ISSUE 366: do not call schema converter to generate projected schema when projection is not set
    • ISSUE 367: make ParquetFileWriter throw IOException in invalid state case
    • ISSUE 352: Parquet thrift storer
    • ISSUE 349: fix header bug
  • v1.4.1 Changes

    • ISSUE 344: select * from parquet hive table containing map columns runs into exception. Issue #341.
    • ISSUE 347: set reading length in ThriftBytesWriteSupport to avoid potential OOM cau...
    • ISSUE 346: stop using strings and b64 for compressed input splits
    • ISSUE 345: set cascading version to 2.5.3
    • ISSUE 342: compress kv pairs in ParquetInputSplits
  • v1.4.0 Changes

    • ISSUE 333: Compress schemas in split
    • ISSUE 329: fix filesystem resolution
    • ISSUE 320: Spelling fix
    • ISSUE 319: oauth based authentication; fix grep change
    • ISSUE 310: Merge parquet tools
    • ISSUE 314: Fix avro schema conv for arrays of optional type for #312.
    • ISSUE 311: Avro null default values bug
    • ISSUE 316: Update poms to use thrift.executable property.
    • ISSUE 285: [CASCADING] Provide the sink implementation for ParquetTupleScheme
    • ISSUE 264: Native Protocol Buffer support
    • ISSUE 293: Int96 support
    • ISSUE 313: Add hadoop Configuration to Avro and Thrift writers (#295).
    • ISSUE 262: Scrooge schema converter and projection pushdown in Scrooge
    • ISSUE 297: Ports HIVE-5783 to the parquet-hive module
    • ISSUE 303: Avro read schema aliases
    • ISSUE 299: Fill in default values for new fields in the Avro read schema
    • ISSUE 298: Bugfix reorder thrift fields causing writing nulls
    • ISSUE 289: first use current thread's classloader to load a class, if current threa...
    • ISSUE 292: Added ParquetWriter() that takes an instance of Hadoop's Configuration.
    • ISSUE 282: Avro default read schema
    • ISSUE 280: style: junit.framework to org.junit
    • ISSUE 270: Make ParquetInputSplit extend FileSplit
  • v1.3.2 Changes

    • ISSUE 271: fix bug: last enum index throws DecodingSchemaMismatchException
    • ISSUE 268: fixes #265: add semver validation checks to non-bundle builds
    • ISSUE 269: Bumps parquet-jackson parent version
    • ISSUE 260: Shade jackson only once for all parquet modules
  • v1.3.1 Changes

    • ISSUE 267: handler only handle ignored field, exception during will be thrown as Sk...
    • ISSUE 266: upgrade parquet-mr to elephant-bird 4.4