Today’s data managers face a growing ecosystem of data sources and warehouses, making big data integration more complex than ever. Your data lives in many data warehouses and data lakes; it continually flows in through streams or rests as point-in-time files. Regardless of the source, MapD easily ingests millions of records per second into the MapD Core open source SQL engine.
Big data ingestion tools must integrate with a wide variety of data sources and networks. Streaming data originates from sensors, network logs, social media, and web clickstreams around the globe, producing billions of records per week for large organizations. Streaming ingest engines such as Apache Kafka organize and distribute this information before finally funneling it into storage.
Although many platforms offer automated streaming analytics tools, only MapD can ingest data at this volume and make it available for interactive exploration by business analysts. MapD provides an easy-to-use utility for Kafka integration, letting you connect to a Kafka topic, consume messages in real time, and rapidly load them into a target MapD table.
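The pattern behind that kind of streaming ingest is simple: parse each Kafka message into a typed row, then buffer rows into large batches so every load call inserts many records at once. The sketch below shows that consume/batch/load shape in plain Python. The message schema (`device_id, temperature`), the batch size, and the loader call in the trailing comment are all assumptions for illustration, not MapD's actual utility; in practice the MapD Kafka importer (or a Kafka consumer paired with a MapD client library) handles this for you.

```python
# Sketch of the consume/batch/load pattern behind streaming ingest.
# A real pipeline would pull `messages` from a Kafka consumer and hand each
# batch to a MapD loader; here both ends are hypothetical stand-ins.

def parse_message(raw):
    """Parse one delimited Kafka message into a typed row.
    Assumed message schema for this sketch: device_id, temperature."""
    device_id, temp = raw.split(",")
    return device_id.strip(), float(temp)

def batch_rows(messages, batch_size):
    """Group parsed rows into fixed-size batches so each load call inserts
    many records per round trip instead of one row at a time."""
    batch = []
    for raw in messages:
        batch.append(parse_message(raw))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the remainder so no trailing rows are lost
        yield batch

# In a real pipeline, each batch would go to the target table, e.g.:
#   for batch in batch_rows(consumer_messages(), 10000):
#       con.load_table("sensor_readings", batch)   # hypothetical loader call
```

Batching is the key design choice: per-row inserts would throttle throughput on network round trips, while large batches let the engine ingest at the rates described above.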
Most of the world’s data is at rest, stored in data warehouses, enterprise databases, or Hadoop data lakes. The vast majority of this data has never been explored or analyzed, and it represents an incredible amount of untapped insight. MapD easily supports batch import of data at rest via these methods:
For Delimited Files:
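To make the delimited-file path concrete, here is a minimal stdlib-only sketch of reading such a file into header and rows. The file contents and column names are invented for the example; in practice MapD performs this parsing server-side during bulk import, so this only illustrates the shape of the data a delimited load works with.

```python
import csv
import io

def read_delimited(fileobj, delimiter=","):
    """Read a delimited file into (header, rows), treating the first line
    as column names. A bulk loader does this at scale; this sketch just
    shows the structure a delimited import produces."""
    reader = csv.reader(fileobj, delimiter=delimiter)
    header = next(reader)                      # column names from line one
    return header, [tuple(row) for row in reader]

# An in-memory file stands in for a CSV on disk (hypothetical data):
header, rows = read_delimited(io.StringIO("id,city\n1,Oslo\n2,Lima\n"))
# header -> ['id', 'city']; rows -> [('1', 'Oslo'), ('2', 'Lima')]
```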
From Data Lakes or Data Warehouses: