Tuesday, 20 November 2018

Apache Kafka - Introduction

     Big Data involves enormous volumes of data, and that raises two main challenges. The first is how to collect a large volume of data, and the second is how to analyze the data once it has been collected. To overcome these challenges, you need a messaging system.

What is a Messaging System?
     A messaging system is responsible for transferring data from one application to another, so the applications can focus on the data itself rather than on how to share it. Distributed messaging is based on the concept of reliable message queuing: messages are queued asynchronously between client applications and the messaging system. Two messaging patterns are available: point-to-point and publish-subscribe (pub-sub). Most messaging systems follow the pub-sub pattern.

Point to Point Messaging System
     In a point-to-point system, messages are persisted in a queue. One or more consumers can read from the queue, but any particular message can be consumed by at most one consumer. Once a consumer reads a message, it disappears from the queue. In the point-to-point (p2p) model, the producer is called a sender and the consumer is called a receiver. The most important characteristics of the point-to-point model are as follows:

☀ Messages are exchanged through a virtual channel called a queue. A queue is a destination to which producers send messages and a source from which receivers consume messages.

☀ Each message is delivered to only one receiver. Multiple receivers may listen on a queue, but each message in the queue may only be consumed by one of the queue’s receivers.

☀ Messages are ordered. A queue delivers messages to consumers in the order they were placed in the queue by the message server. As messages are consumed, they are removed from the head of the queue (unless message priority is used).

☀ There is no coupling of the producers to the consumers. Receivers and senders can be added dynamically at run time, allowing the system to grow or shrink in complexity over time. (This is a characteristic of messaging systems in general.)
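
The characteristics above can be sketched with a small in-memory queue. This is only an illustration under simplifying assumptions: the class `PointToPointQueue`, the helper `drain_round_robin`, and the receiver names are hypothetical, there is no persistence or network, and receivers simply take turns instead of competing concurrently.

```python
from collections import deque


class PointToPointQueue:
    """Toy in-memory queue: each message goes to at most one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        # Senders append to the tail of the queue.
        self._messages.append(message)

    def receive(self):
        # Receivers take from the head; the message is removed once read.
        # Returns None when the queue is empty.
        if self._messages:
            return self._messages.popleft()
        return None


def drain_round_robin(queue, receiver_names):
    """Let the receivers take turns pulling messages until the queue is empty."""
    consumed = {name: [] for name in receiver_names}
    turn = 0
    while True:
        message = queue.receive()
        if message is None:
            break
        consumed[receiver_names[turn % len(receiver_names)]].append(message)
        turn += 1
    return consumed


orders = PointToPointQueue()
for i in range(1, 6):
    orders.send(f"order-{i}")

# Two receivers listen on the same queue; no message is delivered twice.
result = drain_round_robin(orders, ["receiver-a", "receiver-b"])
print(result)
```

Every message ends up with exactly one of the two receivers, in the order it was sent, and the queue is empty afterwards. The sender never refers to a receiver directly, which is the decoupling described in the last bullet.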

      There are two types of point-to-point messaging: asynchronous fire-and-forget processing and asynchronous request/reply processing. With fire-and-forget processing, the message producer sends a message to a queue and does not expect to receive a response. This type of processing can be used to trigger an event or to ask a receiver to execute an action that does not require a response (or, in some cases, does not require an immediate one). For instance, you may want to use fire-and-forget processing to send a message to a logging system, kick off a report, or trigger an event in another process. Asynchronous fire-and-forget processing is illustrated in the following figure.
      With asynchronous request/reply processing, the message producer sends a message on one queue and then does a blocking wait on a reply queue for the response from the receiver. Request/reply processing provides a high degree of decoupling between the producer and the consumer, allowing the two components to be written in different languages or to run on different platforms. Asynchronous request/reply processing is illustrated in the following figure.
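
As a rough sketch of request/reply processing, the example below uses two queues from Python's standard `queue` module and a background thread standing in for the receiver. The names `request_queue`, `reply_queue`, and `receiver`, the message text, and the 5-second timeout are all assumptions made for illustration; a real messaging system would place a broker between the two sides.

```python
import queue
import threading

request_queue = queue.Queue()  # producer -> receiver
reply_queue = queue.Queue()    # receiver -> producer


def receiver():
    # Take one request from the request queue, process it,
    # and send the response back on the reply queue.
    request = request_queue.get()
    reply_queue.put(f"processed:{request}")


worker = threading.Thread(target=receiver)
worker.start()

# Producer side: send the request, then block on the reply queue.
request_queue.put("generate-report")
reply = reply_queue.get(timeout=5)  # blocking wait, bounded by a timeout

worker.join()
print(reply)
```

Fire-and-forget processing would be the same producer code without the `reply_queue.get(...)` call: the sender puts the message on the queue and carries on.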


Publish-Subscribe Messaging System
      This messaging model is used when you need to broadcast an event or message to many message consumers. Unlike in the point-to-point model, every message consumer listening on the topic receives the message. The publish-and-subscribe model is more commonly referred to as the pub/sub model. In this model, the message producer is called a publisher and the message consumer is called a subscriber. Messages are published to a topic rather than sent to a queue, as they are in the point-to-point model. The most important characteristics of the pub/sub model are as follows:

☀ Messages are exchanged through a virtual channel called a topic.

☀ Each message is delivered to multiple message consumers, called subscribers. There are many types of subscribers, including durable, nondurable, and dynamic.

☀ The publisher generally does not know which subscribers are receiving the topic's messages.

☀ Messages are pushed to consumers, which means that messages are delivered to consumers without the consumers having to request them. A topic is a destination to which producers can publish messages and from which subscribers can consume them. Messages delivered to a topic are automatically pushed to all qualified consumers.

☀ As in enterprise messaging in general, there is no coupling of the producers to the consumers. Subscribers and publishers can be added dynamically at runtime, which allows the system to grow or shrink in complexity over time.

☀ Every client that subscribes to a topic receives its own copy of messages published to that topic. A single message produced by one publisher may be copied and distributed to hundreds or even thousands of subscribers.
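
The pub/sub characteristics above can be sketched as a tiny in-memory topic. This is a simplified illustration, not how any particular broker is implemented: the `Topic` class, its `subscribe` and `publish` methods, and the message names are hypothetical, and subscribers are plain callbacks that behave like nondurable subscribers (they only see messages published after they subscribe).

```python
class Topic:
    """Toy in-memory topic that pushes each message to every subscriber."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, callback):
        # Subscribers can be added at any time while the system is running.
        self._subscribers.append(callback)

    def publish(self, message):
        # The publisher does not address subscribers individually; the topic
        # pushes the message to whoever is subscribed at this moment.
        for callback in list(self._subscribers):
            callback(message)
        return len(self._subscribers)


inbox_a, inbox_b = [], []

events = Topic("events")
events.subscribe(inbox_a.append)
events.publish("event-1")         # only subscriber A exists so far

events.subscribe(inbox_b.append)  # subscriber B joins at runtime
delivered = events.publish("event-2")

print(inbox_a, inbox_b, delivered)
```

Subscriber A receives both messages, while subscriber B, which joined later, receives only the second one. Compare this with the point-to-point model, where each message would reach only one of the two consumers.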


