Apache Avro Schema Example (in Java)

Updated: Sep 13, 2019

Introduction

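  • Avro provides data serialization driven by schemas, and the schemas themselves are written in JSON.

  • It is a language-neutral data serialization system: data serialized by a program in one language can be deserialized and used by a program in another.

  • Avro supports both dynamic types (generic records interpreted with a schema at runtime) and static types (classes generated from the schema), as required.

  • It supports many languages, such as Java, C, C++, C#, Python and Ruby.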


Benefits

  • Producers and consumers are decoupled: each application can change without forcing a change in the other.

  • Schemas help future-proof your data and make it more robust.

  • It is widely used in streaming use cases, especially with Kafka.

  • Avro data is compact and fast to process, which suits streaming.

  • It is supported by the schema registry when used with Kafka.
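
To make the compactness point concrete, here is a minimal sketch of the zig-zag, variable-length encoding that the Avro specification describes for int and long values. The class and method names (ZigZagVarintSketch, encodeInt, decodeInt) are hypothetical and exist only for illustration; they are not part of the Avro library, which performs this encoding internally.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration only; not part of the Avro library.
class ZigZagVarintSketch {

    // Encodes an int as described in the Avro specification:
    // zig-zag mapping first, then 7 bits per byte, low-order group first.
    static byte[] encodeInt(int value) {
        // Zig-zag maps values of small magnitude to small unsigned numbers:
        // 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
        int zigZag = (value << 1) ^ (value >> 31);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // While more than 7 bits remain, emit 7 bits with the continuation bit set.
        while ((zigZag & ~0x7F) != 0) {
            out.write((zigZag & 0x7F) | 0x80);
            zigZag >>>= 7;
        }
        out.write(zigZag);
        return out.toByteArray();
    }

    // Reverses encodeInt: reassembles the 7-bit groups, then undoes the zig-zag mapping.
    static int decodeInt(byte[] bytes) {
        int zigZag = 0;
        int shift = 0;
        for (byte b : bytes) {
            zigZag |= (b & 0x7F) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {
                break; // last byte has no continuation bit
            }
        }
        return (zigZag >>> 1) ^ -(zigZag & 1);
    }

    public static void main(String[] args) {
        for (int id : new int[] {0, -1, 1, 1001, 1002}) {
            byte[] encoded = encodeInt(id);
            System.out.println(id + " -> " + encoded.length
                    + " byte(s), decoded back to " + decodeInt(encoded));
        }
    }
}
```

With this encoding an id such as 1001 occupies two bytes rather than the four bytes of a fixed-width int.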


Steps to Serialize Object

  • Create the schema in JSON.

  • Compile the schema in the application.

  • Populate an object of the generated class with data.

  • Serialize the data using an Avro serializer.



Steps to Deserialize Object

  • Use the Apache Avro API to read the serialized file.

  • Populate objects of the schema's type from the file.

  • Use the objects in the application.



Sample Example for Avro (in Java)


Step-1: Create a Java project and add the dependencies as below.


Project Structure and Dependencies


Step-2: Create a Schema file as below:


Customer_v0.avsc


{
  "namespace": "com.demo.avro",
  "type": "record",
  "name": "Customer",
  "fields": [
    {
      "name": "id",
      "type": "int"
    },
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "faxNumber",
      "type": ["null", "string"],
      "default": null
    }
  ]
}
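
As a rough illustration of what this schema implies on the wire, the sketch below hand-encodes one Customer record following the binary encoding rules in the Avro specification: fields are written in schema order with no field names, a string is its UTF-8 byte length followed by the bytes, and a union is the index of the chosen branch followed by that branch's value. The class CustomerLayoutSketch and its methods are hypothetical and written only for this example. Real applications would use Avro's own writers, and this shows only the record body, not the header and block structure of an Avro data file.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of the Avro library.
class CustomerLayoutSketch {

    // Avro int/long: zig-zag mapping, then variable-length 7-bit groups.
    static void writeLong(ByteArrayOutputStream out, long value) {
        long zigZag = (value << 1) ^ (value >> 63);
        while ((zigZag & ~0x7FL) != 0) {
            out.write((int) ((zigZag & 0x7F) | 0x80));
            zigZag >>>= 7;
        }
        out.write((int) zigZag);
    }

    // Avro string: UTF-8 byte length written as a long, followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Encodes one record in schema field order: id, name, faxNumber.
    static byte[] encodeCustomer(int id, String name, String faxNumber) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, id);
        writeString(out, name);
        if (faxNumber == null) {
            writeLong(out, 0); // union branch 0 is "null"; it carries no payload
        } else {
            writeLong(out, 1); // union branch 1 is "string"
            writeString(out, faxNumber);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] withFax = encodeCustomer(1001, "Customer -1", "2847473843");
        byte[] withoutFax = encodeCustomer(1002, "Customer -2", null);
        System.out.println("with fax:    " + withFax.length + " bytes");
        System.out.println("without fax: " + withoutFax.length + " bytes");
    }
}
```

Because "null" is the first branch of the faxNumber union, the field's default must be the JSON value null rather than the string "null"; a record without a fax number then costs a single extra byte for the branch index.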


Step-3: Compile the schema.

java -jar lib\avro-tools-1.8.1.jar compile schema schema\Customer_v0.avsc schema


Step-4: Copy the generated Java file into the source directory of the project, as shown in the project structure.


Step-5: Create the Producer.java


package com.demo.producer;

import java.io.File;
import java.io.IOException;

import org.apache.avro.file.DataFileWriter;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.specific.SpecificDatumWriter;

import com.demo.avro.Customer;

public class Producer {

    public static void main(String[] args) throws IOException {
        serializeMessage();
    }

    public static void serializeMessage() throws IOException {
        // Writer for the Customer class generated from Customer_v0.avsc
        DatumWriter<Customer> datumWriter = new SpecificDatumWriter<>(Customer.class);
        DataFileWriter<Customer> dataFileWriter = new DataFileWriter<>(datumWriter);
        File file = new File("customer.avro");

        // Create the data file; the schema is stored in the file
        Customer customer = new Customer();
        dataFileWriter.create(customer.getSchema(), file);

        // First record; subSequence(0, 10) keeps the first 10 characters
        customer.setId(1001);
        customer.setName("Customer -1");
        customer.setFaxNumber("284747384343333".subSequence(0, 10));
        dataFileWriter.append(customer);

        // Second record
        customer = new Customer();
        customer.setId(1002);
        customer.setName("Customer -2");
        customer.setFaxNumber("45454747384343333".subSequence(0, 10));
        dataFileWriter.append(customer);