
  • Kafka Consumer Advanced (Java example)

    Prerequisite: Kafka Overview, Kafka Producer & Consumer

Commits and Offsets in Kafka Consumer
Once the client commits an offset, Kafka marks the messages up to that offset as consumed for that consumer group, so they will not be returned to the client in the next poll.

Properties used in the example below:
bootstrap.servers=localhost:9092
ProducerConfig.RETRIES_CONFIG=0
value.deserializer=org.apache.kafka.common.serialization.StringDeserializer
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
retries=0
group.id=group1
HQ_TOPIC_NAME=EK.TWEETS.TOPIC
CONSUMER_TIMEOUT=1000
worker.thread.count=5
consumer.count=3
auto.offset.reset=earliest
enable.auto.commit=false

Configuration Level Setting
This can be done at configuration level in the properties file.
enable.auto.commit=false - The consumer API (client code) takes the decision to retain the offset or commit it.
enable.auto.commit=true - This is the default setting. Once a message is consumed, its offset is committed automatically unless the client code takes the decision itself.

Consumer API Level Setting

Synchronous Commit
The offset is committed as soon as the consumer API confirms it; the latest offset of the polled messages is committed. The example below commits after processing all messages of the current poll. A synchronous commit blocks until the broker responds to the commit request.

Sample Code
public synchronized void subscribeMessage(String configPropsFile) throws Exception {
    try {
        if (consumer == null) {
            consumer = (KafkaConsumer<String, String>) getKafkaConnection(configPropsFile);
            System.out.println("Kafka Connection created...on TOPIC : " + getTopicName());
        }
        consumer.subscribe(Collections.singletonList(getTopicName()));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(10000L);
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("Received Message topic =%s, partition =%s, offset = %d, key = %s, value = %s\n",
                        record.topic(), record.partition(), record.offset(), record.key(), record.value());
            }
            consumer.commitSync();
        }
    } catch (Exception e) {
        e.printStackTrace();
        consumer.close();
    }
}

Asynchronous Commit
The consumer does not wait for a response from the broker; the commit request is sent and processing continues. Throughput is higher compared to synchronous commit, but there is a chance of duplicate reads, which the application needs to handle on its own.

Sample Code
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(10000L);
    System.out.println("Number of messages polled by consumer " + records.count());
    for (ConsumerRecord<String, String> record : records) {
        System.out.printf("Received Message topic =%s, partition =%s, offset = %d, key = %s, value = %s\n",
                record.topic(), record.partition(), record.offset(), record.key(), record.value());
    }
    consumer.commitAsync(new OffsetCommitCallback() {
        public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception exception) {
            if (exception != null) {
                System.out.println("Commit failed for offsets " + offsets + " : " + exception);
            } else {
                System.out.println("Messages are Committed Asynchronously...");
            }
        }
    });
}

Offset Level Commit
Sometimes an application may need to commit on reaching a particular offset.
Sample Code
Map<TopicPartition, OffsetAndMetadata> currentOffsets = new HashMap<>();
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(1000L);
    for (ConsumerRecord<String, String> record : records) {
        System.out.printf("Received Message topic =%s, partition =%s, offset = %d, key = %s, value = %s\n",
                record.topic(), record.partition(), record.offset(), record.key(), record.value());
        currentOffsets.put(new TopicPartition(record.topic(), record.partition()),
                new OffsetAndMetadata(record.offset() + 1, "no metadata"));
        if (record.offset() == 18098) {
            consumer.commitAsync(currentOffsets, null);
        }
    }
}

Retention of Message
Kafka retains messages for the retention period defined in the configuration. Retention can be defined at the broker level or at the topic level, and it can be based on time or on size (bytes). Retention defined at the topic level overrides the retention defined at the broker level.
retention.bytes - The maximum amount of data, in bytes, to retain for this topic.
retention.ms - How long messages should be retained for this topic, in milliseconds.
1. Defining retention at topic level
Setting retention for the topic named "test-topic" to 1 hour (3,600,000 ms):
# kafka-configs.sh --zookeeper localhost:2181/kafka-cluster --alter --entity-type topics --entity-name test-topic --add-config retention.ms=3600000
2. Defining retention at broker level
Define one of the below properties in server.properties:
# Configures retention time in milliseconds => log.retention.ms=1680000
# Configures retention time in minutes => log.retention.minutes=1680
# Configures retention time in hours => log.retention.hours=168

Fetching Message From A Specific Offset
A consumer can go down before committing the messages it has processed, so some messages may need to be read again. Since the Kafka broker can retain messages for a long time, the consumer can point to a specific offset to fetch them: it can go back from the current offset to a particular offset, or start polling messages from the beginning.

Sample Code
Map<TopicPartition, OffsetAndMetadata> currentOffsets = new HashMap<>();

public synchronized void subscribeMessage(String configPropsFile) throws Exception {
    try {
        if (consumer == null) {
            consumer = (KafkaConsumer<String, String>) getKafkaConnection(configPropsFile);
            System.out.println("Kafka Connection created...on TOPIC : " + getTopicName());
        }
        TopicPartition topicPartition = new TopicPartition(getTopicName(), 0);
        List<TopicPartition> partitions = Arrays.asList(topicPartition);
        consumer.assign(partitions);
        consumer.seekToEnd(partitions);
        long current = consumer.position(topicPartition);
        consumer.seek(topicPartition, current - 10);
        System.out.println("Topic partitions are " + consumer.assignment());
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(10000L);
            System.out.println("Number of records polled " + records.count());
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("Received Message topic =%s, partition =%s, offset = %d, key = %s, value = %s\n",
                        record.topic(), record.partition(), record.offset(), record.key(), record.value());
                currentOffsets.put(new TopicPartition(record.topic(), record.partition()),
                        new OffsetAndMetadata(record.offset() + 1, "no metadata"));
            }
            consumer.commitAsync(currentOffsets, null);
        }
    } catch (Exception e) {
        e.printStackTrace();
        consumer.close();
    }
}

Thank you. If you have any doubt please feel free to post your questions in the comments section below.
[23/09/2019 04:38 PM CST - Reviewed by: PriSin]
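Note: the samples above call a helper getKafkaConnection(configPropsFile) that the post does not show. Below is a minimal, hypothetical sketch of what such a helper might look like, assuming the property names listed at the start of this post (bootstrap.servers, group.id, the String deserializers and enable.auto.commit=false). It is an illustration only, not the author's original implementation.

Sample Code (sketch)
import java.io.FileInputStream;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerConnectionFactory {

    // Hypothetical helper: loads the properties file and builds a KafkaConsumer from it.
    public static KafkaConsumer<String, String> getKafkaConnection(String configPropsFile) throws Exception {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(configPropsFile)) {
            // expects bootstrap.servers, group.id, key/value deserializers, enable.auto.commit, etc.
            props.load(in);
        }
        return new KafkaConsumer<>(props);
    }
}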

  • ELK stack Installation on OEL (Oracle Enterprise Linux)

    Refer to my previous blog to install the Oracle Enterprise Linux operating system on your machine. If you have any other Linux-based operating system such as CentOS, Ubuntu, Red Hat Linux etc., these steps will be similar.

Navigation Menu: Introduction to ELK Stack Installation | Loading data into Elasticsearch with Logstash | Create Kibana Dashboard Example | Kibana GeoIP Dashboard Example

Elasticsearch Installation
Before we start the Elasticsearch installation, I hope you all have Java installed on your machine; if not, please refer to this. Once you have installed Java successfully, go to this link and download the latest version of Elasticsearch. https://www.elastic.co/downloads/ I downloaded the TAR file (elasticsearch-6.2.4.tar.gz) to explain this blog.
For machines with a GUI like CentOS, Ubuntu: Once you download it on your local machine, move it to the Linux environment where you want to run Elasticsearch. I use MobaXterm (an open source tool) to transfer files from my Windows machine to the Linux environment (a Red Hat Linux client without GUI in this case).
For non-GUI Linux machines: Simply run wget on your Linux machine (if you don't have the wget package installed, run this command as the root user to install it: yum install wget -y).
Run the below commands to install Elasticsearch with any user except root. Change the version according to your requirement; I also dropped the 6.2.4 suffix from the directory name for simplicity.
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.2.4.tar.gz
tar -xvzf elasticsearch-6.2.4.tar.gz
rm -f elasticsearch-6.2.4.tar.gz
mv elasticsearch-6.2.4 elasticsearch

Start Elasticsearch
To start Elasticsearch, navigate to the Elasticsearch directory and launch elasticsearch.
cd elasticsearch/
./bin/elasticsearch

Running Elasticsearch in Background
You can start Elasticsearch in the background as well with the below commands: run nohup and disown the process. Later you can find the Java process running on your machine, or simply note down the PID printed after executing nohup. In the case below, 25605 is the PID.
[hadoop@elasticsearch elasticsearch]$ nohup ./bin/elasticsearch &
[1] 25605
[hadoop@elasticsearch elasticsearch]$ nohup: ignoring input and appending output to ‘nohup.out’
disown
[hadoop@elasticsearch elasticsearch]$ ps -aux | grep java
hadoop 25605 226 6.1 4678080 1257552 pts/0 Sl 11:54 0:31 /usr/java/java/bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp/elasticsearch.zbtKhO5i -XX:+HeapDumpOnOutOfMemoryError -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:logs/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=32 -XX:GCLogFileSize=64m -Des.path.home=/home/hadoop/apps/installers/elasticsearch -Des.path.conf=/home/hadoop/apps/installers/elasticsearch/config -cp /home/hadoop/apps/installers/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch

Note: If you get the below error, please make sure you are not logged in as root. Remove the file, log in with a different user and redo the above steps. Remember, I said to install Elasticsearch with any user except root.
Error: java.nio.file.AccessDeniedException: /home/hadoop/apps/installers/elasticsearch/config/jvm.options

Verify Elasticsearch installation
[hadoop@localhost etc]$ curl http://localhost:9200
{ "name" : "akY11V_", "cluster_name" : "elasticsearch", "cluster_uuid" : "3O3dLMIDRYmJa1zrqNZqug", "version" : { "number" : "6.2.4", "build_hash" : "ccec39f", "build_date" : "2018-04-12T20:37:28.497551Z", "build_snapshot" : false, "lucene_version" : "7.2.1", "minimum_wire_compatibility_version" : "5.6.0", "minimum_index_compatibility_version" : "5.0.0" }, "tagline" : "You Know, for Search" }
Or you can simply open http://localhost:9200 in your local browser if your operating system has a GUI. (A small Java alternative to this curl check is sketched at the end of this post.)

Kibana Installation
Follow similar steps to download the latest Kibana release from the link below: https://www.elastic.co/downloads/kibana
Move the TAR file to your Linux machine or simply run wget to download the file. Modify the version according to your requirement.
wget https://artifacts.elastic.co/downloads/kibana/kibana-6.2.4-linux-x86_64.tar.gz
tar -xvzf kibana-6.2.4-linux-x86_64.tar.gz
rm -f kibana-6.2.4-linux-x86_64.tar.gz
mv kibana-6.2.4-linux-x86_64 kibana
Now open the kibana.yml file and uncomment this line: elasticsearch.url: "http://localhost:9200"
cd kibana
vi config/kibana.yml

Start Kibana
[hadoop@localhost kibana]$ ./bin/kibana
log [21:09:12.958] [info][status][plugin:kibana@6.2.4] Status changed from uninitialized to green - Ready
log [21:09:13.091] [info][status][plugin:elasticsearch@6.2.4] Status changed from uninitialized to yellow - Waiting for Elasticsearch
log [21:09:13.539] [info][status][plugin:timelion@6.2.4] Status changed from uninitialized to green - Ready
log [21:09:13.560] [info][status][plugin:console@6.2.4] Status changed from uninitialized to green - Ready
log [21:09:13.573] [info][status][plugin:metrics@6.2.4] Status changed from uninitialized to green - Ready
log [21:09:13.637] [info][listening] Server running at http://localhost:5601
log [21:09:13.758] [info][status][plugin:elasticsearch@6.2.4] Status changed from yellow to green - Ready
You can start Kibana in the background as well by executing the below command:
[hadoop@elasticsearch kibana]$ ./bin/kibana &
[2] 23866
[hadoop@elasticsearch kibana]$ log [15:30:26.029] [info][status][plugin:kibana@6.2.4] Status changed from uninitialized to green - Ready
log [15:30:26.164] [info][status][plugin:elasticsearch@6.2.4] Status changed from uninitialized to yellow - Waiting for Elasticsearch
log [15:30:26.676] [info][status][plugin:timelion@6.2.4] Status changed from uninitialized to green - Ready
log [15:30:26.701] [info][status][plugin:console@6.2.4] Status changed from uninitialized to green - Ready
log [15:30:26.718] [info][status][plugin:metrics@6.2.4] Status changed from uninitialized to green - Ready
log [15:30:26.781] [info][listening] Server running at http://localhost:5601
log [15:30:26.861] [info][status][plugin:elasticsearch@6.2.4] Status changed from yellow to green - Ready
disown

Logstash Installation
Follow similar steps to download the latest Logstash release from the link below: https://www.elastic.co/downloads/logstash
Or run the below commands with wget to download and install:
wget https://artifacts.elastic.co/downloads/logstash/logstash-6.2.4.tar.gz
tar -xvzf logstash-6.2.4.tar.gz
rm -f logstash-6.2.4.tar.gz
mv logstash-6.2.4 logstash
Create a sample config file:
cd logstash/config
vi logstash-simple.conf
input { stdin { } }
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}

Start Logstash
[hadoop@localhost logstash]$ ./bin/logstash -f ./config/logstash-simple.conf Sending Logstash's logs to /home/hadoop/apps/installers/logstash/logs which is now configured via log4j2.properties [2018-05-25T17:29:34,107][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/home/hadoop/apps/installers/logstash/modules/fb_apache/configuration"} [2018-05-25T17:29:34,150][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"/home/hadoop/apps/installers/logstash/modules/netflow/configuration"} [2018-05-25T17:29:34,385][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.queue", :path=>"/home/hadoop/apps/installers/logstash/data/queue"} [2018-05-25T17:29:34,396][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.dead_letter_queue", :path=>"/home/hadoop/apps/installers/logstash/data/dead_letter_queue"} [2018-05-25T17:29:35,467][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified [2018-05-25T17:29:35,554][INFO ][logstash.agent ] No persistent UUID file found. Generating new UUID {:uuid=>"1aad4d0b-71ea-4355-8c21-9623927af557", :path=>"/home/hadoop/apps/installers/logstash/data/uuid"} [2018-05-25T17:29:37,391][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.2.4"} [2018-05-25T17:29:38,775][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600} [2018-05-25T17:29:48,843][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50} [2018-05-25T17:29:50,008][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}} [2018-05-25T17:29:50,030][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"} [2018-05-25T17:29:50,614][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://localhost:9200/"} [2018-05-25T17:29:50,781][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6} [2018-05-25T17:29:50,789][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6} [2018-05-25T17:29:50,834][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil} [2018-05-25T17:29:50,873][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}} [2018-05-25T17:29:50,963][INFO ][logstash.outputs.elasticsearch] Installing elasticsearch template to _template/logstash [2018-05-25T17:29:51,421][INFO 
][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]} [2018-05-25T17:29:51,646][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#"} The stdin plugin is now waiting for input: [2018-05-25T17:29:51,902][INFO ][logstash.agent ] Pipelines running {:count=>1, :pipelines=>["main"]}
hello world
{ "@version" => "1", "@timestamp" => 2018-05-25T21:34:46.148Z, "host" => "localhost.localdomain", "message" => "hello world" }

Accessing Kibana Dashboard
In order to access the Kibana dashboard remotely, set server.host: "0.0.0.0" in the kibana.yml file in the kibana/config directory.
vi kibana.yml
Now try opening the link in your local browser: http://{your machine ip}:5601
Note: If the link doesn't work, try stopping the firewall services on your server by running the below commands:
service firewalld stop
service iptables stop
Here is a sample; in my case it's http://192.16x.x.xxx:5601. If you don't know your Linux machine's details, you can find your IP address by running ifconfig on your machine (inet is your IP).
I hope you enjoyed this post. Please comment below if you have any questions. Thank you!
Next: Loading data into Elasticsearch with Logstash
Navigation Menu: Introduction to ELK Stack Installation | Loading data into Elasticsearch with Logstash | Create Kibana Dashboard Example | Kibana GeoIP Dashboard Example
#InstallElasticsearch #Installation #ELKStack #Elasticsearch #Kibana #Logstash
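Optional: if you prefer to verify the Elasticsearch endpoint from code rather than with curl (see the verification step above), here is a minimal Java sketch using the JDK's built-in HTTP client (assuming Java 11 or later). It simply issues the same GET against http://localhost:9200 and prints the response; treat it as an illustration, not part of the original installation steps.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EsHealthCheck {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200"))  // same endpoint as the curl check
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        // Expect the same JSON banner that curl printed (name, cluster_name, version, tagline)
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}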

  • Permanent Commission in Indian Army

    In the Indian Army, a cadet upon completion of training is commissioned into service under either a Permanent Commission (PC) or a Short Service Commission (SSC). In addition to the initial contract of service, there are several other differences between these two commissions. We will discuss those differences in a subsequent article; here I present a brief overview of the Permanent Commission, the various entries that are part of it, and the academies at which cadets are trained before commissioning. A Permanent Commission means a career in the Army till you retire. For a Permanent Commission, you have to join the Indian Military Academy, Dehradun or the Officers Training Academy, Gaya.
a) Indian Military Academy, Dehradun
The Indian Military Academy is a cradle of leadership. The IMA trains you to lead from the front. You are trained in all aspects of combat and tactics using technology and other modern tools. The IMA has excellent facilities for all-round development. At the IMA, Army cadets are known as Gentlemen Cadets and are given strenuous military training for a period of one year. On successful completion of training, Gentlemen Cadets are granted a Permanent Commission in the rank of Lieutenant, subject to being medically fit in SHAPE-I. Gentlemen Cadets, during the entire duration of training in the Service Academies, i.e. during the training period at IMA, are entitled to a stipend of Rs 21,000/- p.m. From the IMA, you're commissioned as a "Lieutenant" in the Indian Army, to go out into the world and live up to the IMA motto - "Valour & Wisdom". The main entries to get into IMA are as under:
1. National Defence Academy, Pune
The cadets for the three services, viz. Army, Navy and Air Force, are given preliminary training, both academic and physical, for a period of 3 years at the National Defence Academy (Pune), which is an inter-service institution. The training during the first two and a half years is common to the cadets of all three wings. The cadets on passing out will be awarded a B.Sc./B.Sc.(Computer)/B.A. degree from Jawaharlal Nehru University, Delhi. On passing out from the National Defence Academy, Army cadets go to the Indian Military Academy, Dehradun for training and commissioning.
2. Combined Defence Service Examination (CDSE)
You can take the CDS Entrance Exam conducted by UPSC while you are in the final year of graduation or possess any graduate degree. Clear the SSB interview, be medically fit and join IMA as a Direct Entry, subject to meeting all eligibility conditions and the availability of vacancies. (For details of exam dates and notifications, visit the UPSC website.) Candidates finally selected will undergo military training for a period of 18 months at IMA. Candidates will be enrolled under the Army Act as Gentlemen Cadets. On successful completion of training, Gentlemen Cadets are granted a Permanent Commission in the rank of Lt, subject to being medically fit in SHAPE-I. The final allocation of Arms/Services will be made prior to the Gentlemen Cadets passing out from IMA.
The other entries are non-UPSC entries and there is no written exam for them. You are directly called for the SSB interview, and the details are as under:
3. University Entry Scheme (Pre-Final Year Students Only)
This entry is for those of you who wish to apply to the Army in the pre-final year of Engineering. Look out for the advertisement published in leading newspapers/employment news in May every year.
Selected candidates for the UES Course will be detailed for training at the Indian Military Academy, Dehradun according to their position in the final order of merit, discipline-wise, up to the number of vacancies available at that time. The duration of training is one year. Candidates during the period of training will be given a stipend at the rate of the minimum basic pay of a Lieutenant. However, the entire stipend will be paid as a lump sum only on reporting for training at IMA. From the date of joining IMA they will be entitled to full pay and allowances and other benefits as admissible to regular officers. Engineering graduates who possess the prescribed qualification will be granted a one-year antedate for the purposes of seniority, promotion and increments of pay.
4. Technical Graduate Course
Those who are studying in the final year of, or have completed, BE/B.Tech in notified streams can also join IMA through the Technical Graduate Course. Look out for the advertisement published in leading newspapers/employment news in May/Jun & Nov/Dec every year. The duration of training is one year. Selected candidates will be detailed for training at the Indian Military Academy, Dehradun according to their position in the final order of merit, up to the number of vacancies available in each subject. On successful completion of training, cadets will be granted a Permanent Commission in the Army in the rank of Lieutenant. Officers may be granted commission in any Arms/Services and will be liable for service in any part of the world on selected appointments as decided by Army Headquarters from time to time. One year of antedate seniority from the date of commission will be granted to Engineering Graduates of the TGC Entry.
5. AEC (Men)
Only candidates who have passed a postgraduate degree (MA/M.Sc in subjects as per the notification, M.Com, MCA or MBA) with 1st or 2nd division are eligible. Students appearing in the final year or awaiting results are not eligible. The duration of training is one year. Selected candidates will be detailed for training at the Indian Military Academy, Dehradun according to their position in the final order of merit, up to the number of vacancies available in each subject. On successful completion of training, cadets will be granted a Permanent Commission in the Army in the rank of Lieutenant.
b) Officers Training Academy, Gaya (10+2 (TES) Entry)
You can apply after passing your 12th exams. A minimum aggregate of 70% is mandatory in Physics, Chemistry and Mathematics. You will be detailed for the SSB interview based on the cut-off decided by the Recruiting Directorate. Look out for the advertisement published in leading newspapers/employment news in May/Oct every year. The duration of training for the TES Entry is 5 years, of which the first year is basic military training at the Officers Training Academy, Gaya, and the remaining four years of training are at CME Pune, MCTE Mhow or MCEME Secunderabad. The candidates will be awarded an Engineering degree after successful completion of training. The candidates will be given a stipend of Rs. 21,000/- p.m. (Rs 15,600/- as pay in the Pay Band plus Grade Pay of Rs. 5,400/-), as is admissible to NDA cadets on completion of 3 years of training. On completion of 4 years of training they will be commissioned in the rank of Lieutenant and entitled to the pay admissible to the rank. Thank you. If you have any questions please don't hesitate to ask in the SSB group discussion forum or simply comment below.

  • Facts you should know about Food Waste & "Too Good To Go" Application

    Food waste is a major problem nowadays in developed regions like America and Europe. The majority of food waste comes from restaurants and other retail distributors where product quality is the first priority. Some alarming facts about food wastage which you should know:
Roughly one-third of the food produced in the world is wasted every year (approximately 1.3 billion tons).
Every year, consumers in rich countries waste almost 222 million tons of food, whereas the entire net food production of Sub-Saharan Africa is approximately 230 million tons.
Per capita waste by consumers is between 95 and 115 kg per year in Europe and North America, whereas it is only 6-11 kg per year in Sub-Saharan Africa and South and Southeastern Asia.
All over Europe, there is a trend in the food sector of throwing away remaining fresh food items to uphold food quality and hygiene standards. Such a practice undoubtedly brings high customer satisfaction and trust and is commendable, but it also causes food wastage in large quantities. Certainly, developed regions like Europe and the USA can afford such "arrogance", but in today's era of sustainable development goals (SDG) and carbon neutrality, is it not a white-collar crime to keep following such a practice in the name of quality maintenance and customer satisfaction? We live in a world where more than 1 billion people suffer from hunger and where, every 5 seconds, a child dies of hunger or directly related causes. Certainly, the situation is different in developed nations, where the priority is more on the quality of food than on its full utilization, but this ignorance of reality cannot justify such a practice.
Various steps have been taken to reduce the wastage of food. One such step is the "Too Good To Go" mobile app, which works with various restaurants across Europe to prevent food waste by selling surplus food to interested consumers at a heavy discount. It allows both consumers and restaurants to keep food items out of the bin and use them judiciously. Thus, it saves a large quantity of food from going to the bin and in turn generates a profit out of it. I have been traveling within Europe for the past two years and was always perturbed by this practice of throwing good-quality fresh food into the waste bin instead of selling it at a lower price to ensure its full utilization. I was not aware of such a mobile application, which significantly reduces food waste and in turn makes the world a better place to live. Nevertheless, such apps should be promoted to ensure that not a single grain of food ever goes to waste again.

  • Kafka Producer Example

    In this Apache Kafka tutorial you will learn how to install Apache Kafka on Mac using Homebrew. To install Kafka on a Linux machine, refer to this.

Kafka Zookeeper Installation
$ brew install kafka
The command will automatically install Zookeeper as a dependency. Kafka installation will take a minute or so. If you are working in cluster mode, you need to install it on all the nodes.

How to start Kafka & Zookeeper?
You don't need to run these commands right now, but just for understanding, you can see how to start them in the output log.
To start Zookeeper now and restart at login, you need to run this:
$ brew services start zookeeper
Or, if you don't want/need a background service you can just run:
$ zkServer start
To start Kafka now and restart at login:
$ brew services start kafka
Or, if you don't want/need a background service you can just run:
$ zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties & kafka-server-start /usr/local/etc/kafka/server.properties

Zookeeper & Kafka Server Configuration
You can open the Zookeeper properties file to see the default configuration; there is not much to explain here. You can see the port number where the client (Kafka in this case) will connect, the directory where the snapshot will be stored, and the max number of connections per IP address.
$ vi /usr/local/etc/kafka/zookeeper.properties
# the directory where the snapshot is stored.
dataDir=/usr/local/var/lib/zookeeper
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
Similarly, you can see the default Kafka server properties. You just need to change the listener setting here to localhost (standalone mode) or to the IP address of the node in cluster mode.
$ vi /usr/local/etc/kafka/server.properties
Server basics - Basically you define the broker id here; it is a unique integer value for each broker.
Socket server settings - Here you define the listener hostname and port; by default it's commented out. For this example the hostname will be localhost, but in the case of a cluster you need to mention the respective IP addresses. Set it up like this: listeners=PLAINTEXT://localhost:9092
Log basics - Here you define the log directory, the number of log partitions per topic and the recovery threads per data directory.
Internal topic settings - Here you can change the topic replication factor, which is 1 by default; usually in a production environment it's > 1.
Log flush policy - By default everything is commented out.
Log retention policy - The default retention is 168 hours.
Zookeeper - The default port number is the same one you saw during installation: 2181.
Group coordinator settings - This is the rebalance delay in milliseconds when a new member joins as a consumer.
Kafka topics are usually multi-subscriber, i.e. there will be multiple consumers for one topic. However, a topic can have 0, 1 or more consumers.

Starting Zookeeper & Kafka
To start Zookeeper and Kafka, you can start them together like below or run each command separately, i.e. start Zookeeper first and then start Kafka.
$ zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties & kafka-server-start /usr/local/etc/kafka/server.properties
This will print a long list of INFO, WARN and ERROR messages. You can scroll back up and look for WARN and ERROR messages, if any. You can see the producer id and broker id in the log, along with other properties which are set up by default in the Kafka properties file, as explained earlier. Let this process run, don't kill it.
Create a Topic and Start Kafka Producer
To create a topic and start the producer, run this command:
$ kafka-console-producer --broker-list localhost:9092 --topic topic1
Here my topic name is "topic1" and this terminal will act as the producer. You can send messages from this terminal.

Start Kafka Consumer
Now, to start the Kafka consumer, run this command:
$ kafka-console-consumer --bootstrap-server localhost:9092 --topic topic1 --from-beginning
The bootstrap server is basically the server to connect to; for this example it's localhost with the default port 9092. The screen on the left is the producer and the screen on the right is the consumer, and you can see how messages are transferred from one terminal to the other. A minimal Java equivalent of the console producer is sketched after this post. Thank you. If you have any questions please mention them in the comments section below. #Kafka
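For readers who want to produce messages from Java code rather than the console producer, here is a minimal sketch. It assumes the same standalone setup as above (broker on localhost:9092, topic "topic1") and the Kafka clients library on the classpath; treat it as an illustration rather than part of the original walkthrough.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");            // same broker as the console example
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // send() is asynchronous; calling get() blocks until the broker acknowledges the record
            producer.send(new ProducerRecord<>("topic1", "key1", "hello from java")).get();
            System.out.println("Message sent to topic1");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Running this while the console consumer from the previous step is still attached to topic1 should show the message appear in that terminal.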

  • Ultimate Minimalist Baby Registry: The Bare Minimum Baby

    Whether you’re expecting a child of your own or you’re gifting to expectant friends/family, here is a list of absolute must-haves for new parents: Minimalist Must-Haves 1. Sleeping receptacle Whether parents plan on co-sleeping with the popular DockATot or opting for the luxury bassinet Snoo, babies need a place to sleep. If you live in an apartment or are just keeping it simple, opt for a Pack 'n Play (Pack & Play mattress sold separately). Unless your baby is in the 99th percentile in length, this can easily serve as their crib for up to 1 year. Ikea's crib is also a tasteful, safe, and affordable addition to any nursery (mattress sold separately). Depending on which route you choose, you'll need some bedding. I recommend 4 sets of sheets. If you do your laundry once a week, that should give you enough slack for nighttime accidents/leaks. You can buy waterproof sheets but keep in mind that a lot of mattresses are also waterproof. 2. Travel system If you're driving home from the hospital, you won't be released unless you have a car seat. Do not buy your car seat at the thrift store or even "lightly used". This is not the baby item you should be trying to save money on. These also have a lifespan, so don't use an old dusty one your aunt fishes out of her storage unit. There are a multitude of companies that manufacture safe car seats. We personally went with a Chicco travel system; the easy pop-out feature made travelling a breeze. If you opt out of a car seat (maybe you don't own a car and only use cramped public transit systems) then you can also "wear" your baby out in a carrier. Popular options include a sling, wrap, or front pack. There are a ton of different options for baby carriers depending on your own preference, so shop around and do your research. 3. Diaper bag You can use pretty much any kind of bag as a diaper bag, so just put some thought into what you find most convenient (tote, backpack, messenger, etc). Of course, bags intended as diaper bags have very convenient pockets for organizing, so if you plan on using a daycare or just taking your kiddo place to place, a diaper bag is an excellent investment. 4. Baby wipes Here's where it gets complicated. Many babies have sensitive skin and you won't really know if your baby will react to a certain brand of wipe. So I recommend buying a couple of small packs of wipes before you go buying a Costco value box. However, if you're the type who likes to have apocalyptic preparations, then consider stocking up on sensitive skin wipes such as the popular Water Wipes. 5. Diapers* A note on diapers - unlike wipes, these have sizes which your child will rapidly outgrow. Your baby shower may likely be flooded with value boxes of diapers. Some parents stock up on the Newborn size diapers only to find that their baby outgrew them in just a few days. I recommend buying/registering for a small pack of the newborn size and a single larger pack of the size 1 fit. I personally like the Honest diapers (cute prints) but if you're on a budget, generic brands at Target or even Aldi will work just as well. 6. Clothing A note on clothing - this is probably one of the most popular baby shower gifts and you'll receive a lot of different outfits. Some baby clothing can be soooo adorable but also be an absolute nightmare to get on and off. I recommend having 7 easy, everyday outfits (with mittens for sharp-clawed newborns) that you use in rotation. The zippered sleep n play style is particularly easy.
Also, take your local climate into consideration before you stock up. Is your 6 month size fleece snowsuit going to be useful when your child is 6 months old in July? We were gifted so many cute outfits that our baby just didn't have time to wear. 7. Swaddling/burp cloths/breast-feeding covers As a catch-all for this category, muslin blankets are just absolutely perfect. These are so large and useful, I recommend having 6 as you'll want one in the diaper bag, the bedroom, and the play area at all times. They also make excellent little play mats when you're outside or at a friend's house (aka aren't sure about putting your baby down, carpet looks questionable). The popular Aden + Anais brand has many cute design options and is very well made and durable. 8. Night light You are going to be getting up at all times of the night, and if you don't want the abrupt shock of turning on your overhead lighting, you'll definitely need a night light. There are so many different designs and brands, so you really have plenty of freedom to choose what you like. 9. Bottles and pacifiers* A note on bottles and pacifiers - if you plan on breast feeding, it may be difficult to convince your child to switch from one to the other. Consult your local hospital's breast feeding consultant more on this matter. However, if you're on maternity leave, you'll eventually have to return to work, so you'll need a nice set of bottles. If you hate the idea of washing bottles, you can also opt for disposable pumping bags like the Kiinde ones that your baby can directly drink from. When your baby starts solids, you can also pack these pouches with baby food for on-the-go snacks. We use the Comotomo bottles (4); they tumble around a little in the fridge but we love how easy they are to clean. We also use the BIBS pacifiers; they are truly a lifesaver, and other parents at our daycare have asked about them and switched. 10. Boppy The Boppy newborn lounger is just a super easy place to set your baby down, comfy and convenient. The original Boppy is great for breast-feeding and for propping up baby while they learn to sit up. If you want just one, I have to say that I'd go with the newborn lounger. It was so great and she loved napping in it so much. And it's also pretty comfy to use as a pillow for adults if you want to steal it for a bit. 11. Changing pad There's a good number of accidents that can occur while you're changing your baby, and a waterproof changing pad is a good investment. Also, our little one loves to sleep on her Ikea changing pad for some reason. She absolutely hated being in the Snuggle Me Organic but she'd relax and doze off immediately on her changing pad. Kids are strange. 12. Baby bathtub There's nothing more anxiety-inducing than giving your baby a bath for the first time, so do yourself a favor and get a simple baby bathtub. 13. Towels and washcloths Getting 3-4 towels and 8-10 washcloths is more than enough since you won't be bathing baby every day. The towels and washcloths can also play double duty as burp cloths or to clean up messes, especially when they start on solid foods. 14. Toiletries A baby wash, sunscreen, and baby ointment are the bare necessities. Due to allergies, you may not want to stock up on these before you figure out if your little one will have a reaction. Our hospital recommended Johnson's baby shampoo so we've used that, and we swear by Aquaphor Healing Ointment for diaper rash and general skin irritation. 15. Baby first aid and grooming kit You can register for a kit or compile your own.
The first thing I would add is the number for Poison Control (800-222-1222). The second thing you'll need is a baby thermometer so you can check for fevers. You'll also need nail clippers/grinders, a hairbrush, and a snot removal device (like the snot-sucker). Consult your doctor about the use of baby pain management and fever reducers (acetaminophen etc). Indulgent Nice-To-Haves 16. Changing table/station 17. Bathtub kneeler 18. Baby swing/bouncer/rocking chair 19. Baby white noise machine 20. Diaper bin 21. Baby monitor 22. Books 23. Toys There are so many things that will depend on your baby's temperament, likes, and dislikes, so if at all possible - REGISTER FOR GIFT CARDS. Of course this request depends on your baby shower guests' temperaments, likes, and dislikes. But this way, you can react more flexibly instead of having crippling buyer's remorse for that one expensive baby swing that your child cries at the sight of. A cluttered house filled with baby gear can be stressful and anxiety-inducing, especially if the baby doesn't like to use any of it. Good luck and best wishes! #minimalist #parenting #baby #babyregistry #babyregistrylist #babymusthaves #newborn #firsttimeparent #minimalism #minimalistparenting #nursery #apartment #decluttering

  • Adsense Alternatives for Small Websites

    Did your Google Adsense application get rejected? If the answer is yes, you are at the right place. It's not the end of the world, and no doubt there are several other alternatives to Google Adsense for small websites (hosted on WIX, Wordpress, Blogger etc.). But the biggest question is - which one should you vouch for? This is my weekly report for an Adsense account (for a one-month-old website) other than dataneb.com. Still thinking?

Why did your Google Adsense account get rejected?
Simple: it's usually because of poor content on the website, low traffic, site unavailability during verification, some policy violation, etc. It could be due to one or more of the following reasons:
Not Enough Content
Website Design & Navigation
Missing About Us or Contact Page
Low Traffic
New Website
Poor Quality Content
Language Restriction
Number of Posts
Number of Words per Post
Using Free Domain
There could be any number of factors, and even after spending several hours on research you may never get the answer. So don't get upset; as I said, it's not the end of the world, and there are several other alternatives to Google Adsense. However, there is no doubt that Google Adsense provides the easiest and best monetization method for getting a steady income from your blogs. So if you have a Google Adsense account, utilize it wisely.
My rejection reason (but finally got approved) - I thought this might help others. I was providing the wrong URL. Yeah, I know it's funny and... a silly mistake. Make sure you provide the correct website address while submitting the Google Adsense request form. My initial two requests got denied because I entered the wrong website URL: the first time I entered http://dataneb.com and the second time I entered https://dataneb.com. The correct name was https://www.dataneb.com. Yeah, I understand it's a silly mistake, but this is what it is. You need to mention the URL correctly, otherwise Google crawlers will never read your website.
Another reason which is very common but rarely mentioned is Google Adsense bots. Yes, that's true: Adsense does not look into each request and website content manually; its bots perform the hard task. The problem is content that is generated via JavaScript/AJAX-style retrieval: since most crawlers, including AdSense's, do not execute JavaScript, the crawler never sees your website content and therefore sees your site as having no content. You will often face this issue with websites like WIX. Whatever the reason is, I would suggest that instead of wasting time and money on your existing traffic, you move forward with other alternatives which have a very easy approval process and will generate a similar amount of revenue.
Google Adsense approval time? Usually it's within 48 hours, but sometimes longer depending upon the quality of your website. If you don't get approval in the first couple of requests, trust me, you are stuck in an infinite loop of wait time. I was a little lucky in this case; it took 24 hours for me to get the final approval after a couple of issues.
Maximum number of Adsense units you can place on each page? No restriction; earlier it was just 3.
How to integrate ads with your website? Just add the HTML code provided by these systems to an HTML widget anywhere on the page where you want to show the ads.

Google Adsense Alternatives
I am not going to list the top 5, 10 or 20 and confuse you more. Instead I am recommending just 3, based on my personal experience, ease of approval and revenue, which helped me grow my business.
I have used them and I am still using them (apart from Google Adsense) for my other websites. So, let's meet our top 3 alternatives to Google Adsense. Before these 3 alternatives, I would suggest you try the Amazon Affiliates program if you don't have much traffic. However, Amazon does not pay you for clicks or impressions; it pays only when a sale happens.

1. Media.Net (BEST alternative after Adsense)
Approval time - a few hours
No limitation on the number of ad units per page
No hidden fees
It's also known as Yahoo/Bing advertising and it provides you with contextual ads; it holds rank 2 in contextual ads
No minimum traffic requirement
Unlike Adsense, where you have the option to choose image ads, here you have just textual ads
There is no limitation on the number of ad units, unlike Adsense
Supports mobile ads
Further, you can change the size, color and shape of the ad unit according to your convenience
Monthly payment via Paypal ($100 minimum)

2. Infolinks (It's good...)
Approval time - a few hours
It's very simple to integrate with your website
It's open to any publisher - small, medium or large scale
No fees
No minimum requirements for traffic, page views or visitors, and no hidden commitments
The best part is that Infolinks doesn't require space on your blog; it simply converts keywords into advertisement links, so when users hover their mouse over specific keywords it automatically shows advertisements
It provides in-text advertising and pays you per click on ads, not per impression

3. Chitika (It's okay)
Approval time - a few minutes
Language restriction - English only
No minimum traffic requirement
No limitation on the number of ads per page
Payment via Paypal ($10 minimum) or by check ($50 minimum)
Its targeted ads are based on the visitor's location, so if your posts are location-specific this is recommended for you
Limitations on custom ad sizes
Image quality is medium, similar to Adsense

Conclusion
If you have a Google Adsense account, use it wisely. If not, move ahead with Media.net. I would suggest just using Media.net and not overcrowding your good-looking website with tons of different types of ads. Thank you. If you have any questions for me please comment below. Good luck!

  • Installing Apache Spark and Scala (Windows)

    Main menu: Spark Scala Tutorial
In this Spark Scala tutorial you will learn how to download and install:
Apache Spark (on Windows)
Java Development Kit (JDK)
Eclipse Scala IDE
By the end of this tutorial you will be able to run Apache Spark with Scala on a Windows machine, with the Eclipse Scala IDE.

JDK Download and Installation
1. First download the JDK (Java Development Kit) from this link. If you have already installed Java on your machine please proceed to Spark download and installation. I have already installed Java SE 8u171/8u172 (Windows x64) on my machine. Java SE 8u171 means Java Standard Edition 8 Update 171. This version keeps changing, so just download the latest version available at the time of download and follow these steps.
2. Accept the license agreement and choose the OS type. In my case it is the Windows 64-bit platform.
3. Double-click the downloaded executable file (jdk*.exe; ~200 MB) to start the installation. Note down the destination path where the JDK is installing and then complete the installation process (for instance, in this case it says Install to: C:\Program Files\Java\jdk1.8.0_171\).

Apache Spark Download & Installation
1. Download a pre-built version of Apache Spark from this link. Again, don't worry about the version; it might be different for you. Choose the latest Spark release from the drop-down menu and the package type as pre-built for Apache Hadoop.
2. If necessary, download and install WinRAR so that you can extract the .tgz file that you just downloaded.
3. Create a separate directory spark in the C drive. Now extract the Spark files using WinRAR, and copy the contents from the downloads folder to C:\spark. Please note you should end up with a directory structure like C:\spark\bin, C:\spark\conf, etc.

Configuring the Windows environment for Apache Spark
4. Make sure the "Hide file extensions" option in your file explorer (View tab) is unchecked. Now go to the C:\spark\conf folder and rename the log4j.properties.template file to log4j.properties. You should see the filename as log4j.properties and not just log4j.
5. Now open log4j.properties with WordPad and change the line log4j.rootCategory=INFO, console to log4j.rootCategory=ERROR, console. Save the file and exit; we make this change so that only ERROR messages are captured when we run Apache Spark, instead of all INFO messages.
6. Now create a C:\winutils\bin directory. Download winutils.exe from GitHub and extract all the files. You will find multiple Hadoop versions inside it; you just need to focus on the Hadoop version you selected when downloading the pre-built package type (Hadoop 2.x/3.x) in Step 1. Copy all the underlying files (all .dll, .exe etc.) from that Hadoop version folder and move them into the C:\winutils\bin folder. This step is needed to fool Windows into thinking we are running Hadoop. This location will act as the Hadoop home (C:\winutils, with the binaries under \bin).
7. Now right-click your Windows menu, select Control Panel --> System and Security --> System --> "Advanced System Settings" --> then click the "Environment Variables" button. Click the "New" button in User variables and add 3 variables:
SPARK_HOME c:\spark
JAVA_HOME (the path you noted during JDK Installation Step 3, for example C:\Program Files\Java\jdk1.8.0_171)
HADOOP_HOME c:\winutils
8. Add the following 2 paths to your PATH user variable. Select the "PATH" user variable and edit it; if it is not present, create it.
%SPARK_HOME%\bin
%JAVA_HOME%\bin

Download and Install Scala IDE
1. Now install the latest Scala IDE from here. I have installed Scala-SDK-4.7 on my machine. Download the zipped file and extract it.
That's it.
2. Under the Scala-SDK folder you will find an eclipse folder; extract it to c:\eclipse. Run eclipse.exe and it will open the IDE (we will use this later).

Now test it out!
Open up a Windows command prompt in administrator mode: right-click on Command Prompt in the search menu and run as admin.
Type java -version and hit Enter to check if Java is properly installed. If you see the Java version, that means Java is installed properly.
Type cd c:\spark and hit Enter. Then type dir and hit Enter to get a directory listing. Look for any text file, like README.md or CHANGES.txt.
Type spark-shell and hit Enter. At this point you should have a scala> prompt. If not, double-check the steps above, check the environment variables, and after making changes close the command prompt and retry.
Type val rdd = sc.textFile("README.md") and hit Enter. Now type rdd.count() and hit Enter. You should get a count of the number of lines in the readme file! Congratulations, you just ran your first Spark program! We just created an RDD from the readme text file and ran a count action on it. Don't worry, we will be going through this in detail in the next sections. (If you prefer Java, a small standalone version of this line-count test is sketched after the navigation menu below.)
Hit Control-D to exit the Spark shell, and close the console window. You've got everything set up! Hooray!
Note for Python lovers - To install pySpark continue to this blog.
That's all! Guys, if it's not running, don't worry. Please mention it in the comments section below and I will help you out with the installation process. Thank you.
Next: Just enough Scala for Spark
Navigation menu 1. Apache Spark and Scala Installation 1.1 Spark installation on Windows 1.2 Spark installation on Mac 2. Getting Familiar with Scala IDE 2.1 Hello World with Scala IDE 3. Spark data structure basics 3.1 Spark RDD Transformations and Actions example 4. Spark Shell 4.1 Starting Spark shell with SparkContext example 5. Reading data files in Spark 5.1 SparkContext Parallelize and read textFile method 5.2 Loading JSON file using Spark Scala 5.3 Loading TEXT file using Spark Scala 5.4 How to convert RDD to dataframe? 6. Writing data files in Spark 6.1 How to write single CSV file in Spark 7. Spark streaming 7.1 Word count example Scala 7.2 Analyzing Twitter texts 8. Sample Big Data Architecture with Apache Spark 9. What's Artificial Intelligence, Machine Learning, Deep Learning, Predictive Analytics, Data Science? 10. Spark Interview Questions and Answers
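For Java users: the line-count test above uses the Scala shell, but the same thing can be done from a small standalone Java program. The sketch below assumes spark-core is on your classpath (for example via Maven) and uses a local master; it is an illustration under those assumptions, not part of the original tutorial.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class ReadmeLineCount {
    public static void main(String[] args) {
        // Run locally with as many worker threads as logical cores
        SparkConf conf = new SparkConf().setAppName("ReadmeLineCount").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> lines = sc.textFile("C:\\spark\\README.md"); // same file used in the spark-shell test
            System.out.println("Line count: " + lines.count());
        }
    }
}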

  • Spaghetti with Potatoes (Indian Style)

    This spaghetti with Indian spices brings back childhood memories. It is a quick, simple, tasty recipe which everyone can enjoy as a meal at any time of the day! Preparation time: 20 min, Serves: 3-4
Ingredients:
Spaghetti
Onions : 2
Green Chillies : 2
Ginger : 1/2 inch
Potato : 1 (medium sized)
Cilantro/Coriander
Tomato Ketchup : 1 cup
Shredded Cheese
Turmeric Powder : 1/2 tsp
Red Chilli Powder : 1/2 tsp
Dhania Powder (Coriander Seed Powder) : 1 tsp
Salt to taste
Preparation Steps:
Cook the spaghetti as per the instructions mentioned on the packet. While the spaghetti is cooking, heat oil in a pan. Add finely chopped ginger and green chillies. Saute for a minute and add the chopped onions. Once the onions are translucent, add the potatoes (finely chopped into small cubes). Saute for 3-4 minutes. Add the turmeric powder, red chilli powder and dhania powder. Mix well. Cover the pan and let the potatoes cook. Once the potatoes are cooked, add the cooked spaghetti, tomato ketchup, cilantro and salt. Garnish with cheese and cover with the lid so it melts. Serve hot! Hope you all enjoy this recipe!

  • Installing Apache Spark and Scala (Mac)

    Main menu: Spark Scala Tutorial
In this Spark Scala tutorial you will learn how to install Apache Spark on Mac OS. By the end of this tutorial you will be able to run Apache Spark with Scala on a Mac machine. You will also download Eclipse for the Scala IDE. To install Apache Spark on a Windows machine, visit this.

Installing Homebrew
You will be installing Apache Spark using Homebrew. So install Homebrew if you don't have it: visit https://brew.sh/, copy-paste the command into your terminal and run it.

Installing Apache Spark
Open a terminal, type the command brew install apache-spark and hit Enter.
Create a log4j.properties file: type cd /usr/local/Cellar/apache-spark/2.3.1/libexec/conf and hit Enter. Please change the version according to your downloaded version; Spark 2.3.1 is the version installed for me.
Type cp log4j.properties.template log4j.properties and hit Enter.
Edit the log4j.properties file and change the log level on log4j.rootCategory from INFO to ERROR. We are only changing the log level, nothing else.

Download Scala IDE
Install the Scala IDE from here. Open the IDE once, just to check that it's running fine.

Test it out!
Open a terminal and go to the directory where apache-spark was installed (such as cd /usr/local/Cellar/apache-spark/2.3.1/libexec/) and then type ls to get a directory listing. Look for a text file, like README.md or CHANGES.txt.
Type spark-shell and hit Enter. At this point you should see a scala> prompt. If not, double-check the steps above.
Type val rdd = sc.textFile("README.md") (or whatever text file you've found) and hit Enter. You have just created an RDD from the readme text file. Now type rdd.count() and hit Enter to count the number of lines in the text file. You should get the number of lines in that file! Congratulations, you just ran your first Spark program! Don't worry about the commands, I will explain them.
Sample Execution
You've got everything set up! If you have any questions please don't forget to mention them in the comments section below.
Main Menu | Next: Just enough Scala for Spark
Navigation menu 1. Apache Spark and Scala Installation 1.1 Spark installation on Windows 1.2 Spark installation on Mac 2. Getting Familiar with Scala IDE 2.1 Hello World with Scala IDE 3. Spark data structure basics 3.1 Spark RDD Transformations and Actions example 4. Spark Shell 4.1 Starting Spark shell with SparkContext example 5. Reading data files in Spark 5.1 SparkContext Parallelize and read textFile method 5.2 Loading JSON file using Spark Scala 5.3 Loading TEXT file using Spark Scala 5.4 How to convert RDD to dataframe? 6. Writing data files in Spark 6.1 How to write single CSV file in Spark 7. Spark streaming 7.1 Word count example Scala 7.2 Analyzing Twitter texts 8. Sample Big Data Architecture with Apache Spark 9. What's Artificial Intelligence, Machine Learning, Deep Learning, Predictive Analytics, Data Science? 10. Spark Interview Questions and Answers
