Apache Avro Schema Example (in Java)

Updated: Sep 13, 2019

Introduction

  • Avro is a data serialization system whose schemas are defined in JSON.

  • It is language-neutral: data serialized by a program in one language can be deserialized and used by a program in another.

  • Avro supports both dynamic typing (generic records driven only by a schema) and static typing (classes generated from a schema), as required.

  • It has implementations for many languages, including Java, C, C++, C#, Python and Ruby.


Benefits

  • Producers and consumers are decoupled: each side can change its application without breaking the other, as long as the schemas stay compatible.

  • Schemas help future-proof your data and make it more robust.

  • Avro is widely used in streaming use cases, especially with Kafka.

  • Avro's binary format is compact and fast, which suits streaming.

  • With Kafka, Avro schemas can be managed through a schema registry.


Steps to Serialize Object

  • Create the schema in JSON (an .avsc file).

  • Compile the schema to generate the Java classes used by the application.

  • Populate the generated object with data.

  • Serialize the object using the Avro serializer.



Steps to Deserialize Object

  • Use the Apache Avro API to read the serialized file.

  • Populate objects from the file, using the schema stored in it.

  • Use the objects in the application.
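Avro data files are self-describing: an object container file starts with the four magic bytes 'O', 'b', 'j', 1, followed by metadata that includes the writer's schema, which is why the schema can be recovered from the file itself. As a small sanity check before handing a file to the Avro reader, the sketch below looks for that magic header using only the JDK. The class AvroFileCheck and its methods are hypothetical names chosen for illustration and are not part of the Avro library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not an Avro API.
class AvroFileCheck {

    // Magic header of an Avro object container file: 'O', 'b', 'j', 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // Returns true if the given bytes begin with the Avro magic header.
    static boolean hasAvroMagic(byte[] bytes) {
        if (bytes.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (bytes[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads only the first four bytes of the file and checks them.
    static boolean looksLikeAvroFile(Path path) throws IOException {
        try (InputStream in = Files.newInputStream(path)) {
            byte[] header = new byte[MAGIC.length];
            int read = in.readNBytes(header, 0, header.length); // Java 9+
            return read == MAGIC.length && hasAvroMagic(header);
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file name used by the example in this post.
        Path path = Paths.get(args.length > 0 ? args[0] : "customer.avro");
        System.out.println(path + " looks like an Avro file: " + looksLikeAvroFile(path));
    }
}
```

This check only confirms the header; it does not validate the schema or the records that follow.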



Sample Example for Avro (in Java)


Step-1: Create a Java project and add the dependencies as below.


[Image: Project Structure and Dependencies]


Step-2: Create a Schema file as below:


Customer_v0.avsc


{
  "namespace": "com.demo.avro",
  "type": "record",
  "name": "Customer",
  "fields": [
    {
      "name": "id",
      "type": "int"
    },
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "faxNumber",
      "type": ["null", "string"],
      "default": null
    }
  ]
}
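To make the schema concrete, the sketch below hand-encodes one Customer value following the binary encoding rules of the Avro specification: fields are written in schema order with no field names, an int is a zig-zag variable-length integer, a string is its UTF-8 byte length followed by the bytes, and a union is the index of the chosen branch followed by that branch's value (null writes no bytes). It is an illustration written against the JDK only and does not use the Avro library; CustomerLayout and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the byte layout of one Customer record body.
class CustomerLayout {

    // Zig-zag variable-length integer, the format Avro uses for int and long.
    static void writeLong(ByteArrayOutputStream out, long value) {
        long n = (value << 1) ^ (value >> 63);
        // Emit 7 bits per byte; the high bit marks "more bytes follow".
        while ((n & ~0x7FL) != 0) {
            out.write((int) ((n & 0x7F) | 0x80));
            n >>>= 7;
        }
        out.write((int) n);
    }

    // A string is its UTF-8 byte length followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Encodes id, name, faxNumber in schema order; faxNumber may be null.
    static byte[] encode(int id, String name, String faxNumber) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, id);
        writeString(out, name);
        if (faxNumber == null) {
            writeLong(out, 0); // union branch 0 = "null", no value bytes
        } else {
            writeLong(out, 1); // union branch 1 = "string"
            writeString(out, faxNumber);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(encode(1001, "Customer -1", null).length + " bytes without fax");
        System.out.println(encode(1001, "Customer -1", "2847473843").length + " bytes with fax");
    }
}
```

Under these rules the record body for id 1001 and name "Customer -1" is 15 bytes without a fax number and 26 bytes with a 10-digit one. A real Avro data file adds a header and block framing around such record bodies.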


Step-3: Compile the schema.

java -jar lib\avro-tools-1.8.1.jar compile schema schema\Customer_v0.avsc schema


Step-4: Copy the generated Java file into the source directory of the project, as shown in the project structure.


Step-5: Create the Producer.java


package com.demo.producer;

import java.io.File;
import java.io.IOException;

import org.apache.avro.file.DataFileWriter;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.specific.SpecificDatumWriter;

import com.demo.avro.Customer;

public class Producer {

    public static void main(String[] args) throws IOException {
        serializeMessage();
    }

    public static void serializeMessage() throws IOException {
        DatumWriter<Customer> datumWriter = new SpecificDatumWriter<Customer>(Customer.class);
        File file = new File("customer.avro");

        // try-with-resources closes the writer even if an append fails
        try (DataFileWriter<Customer> dataFileWriter = new DataFileWriter<Customer>(datumWriter)) {
            Customer customer = new Customer();
            // Create the file; its header stores the Customer schema
            dataFileWriter.create(customer.getSchema(), file);

            // First record; only the first 10 digits are kept as the fax number
            customer.setId(1001);
            customer.setName("Customer -1");
            customer.setFaxNumber("284747384343333".substring(0, 10));
            dataFileWriter.append(customer);

            // Second record
            customer = new Customer();
            customer.setId(1002);
            customer.setName("Customer -2");
            customer.setFaxNumber("45454747384343333".substring(0, 10));
            dataFileWriter.append(customer);
        }
    }
}


Step-6: Create the Consumer.java


package com.demo.consumer;

import java.io.File;
import java.io.IOException;

import org.apache.avro.file.DataFileReader;
import org.apache.avro.io.DatumReader;
import org.apache.avro.specific.SpecificDatumReader;

import com.demo.avro.Customer;

public class Consumer {

    public static void main(String[] args) throws IOException {
        deserializeMessage();
    }

    public static void deserializeMessage() throws IOException {
        File file = new File("customer.avro");
        DatumReader<Customer> datumReader = new SpecificDatumReader<Customer>(Customer.class);

        // try-with-resources closes the reader and the underlying file
        try (DataFileReader<Customer> dataFileReader = new DataFileReader<Customer>(file, datumReader)) {
            Customer customer = null;
            while (dataFileReader.hasNext()) {
                // Passing the previous object lets Avro reuse it instead of allocating a new one
                customer = dataFileReader.next(customer);
                System.out.println(customer);
            }
        }
    }
}


Step-7: Run Producer.java

It creates the customer.avro file and writes the customer records to it in Avro format.


Step-8: Run Consumer.java

It reads the customer.avro file and prints the customer records.


Thank you! If you have any questions, please mention them in the comments section below.


[12/09/2019 10:38 PM CST - Reviewed by: PriSin]



©2020 by Data Nebulae