Traditionally, integration requirements arise from the need to connect silos. More recently, the impetus may have shifted toward bridging cloud and on-premise systems. The benefits of integrating disparate silos are numerous. Without integration, mining compartmentalized information for Business Intelligence is difficult, if not impossible, and partial data availability degrades operational efficiency and impairs decision making. Data volume, timeliness, performance impact, error handling, single points of failure, schema and format incompatibilities, and connection management are among the key concerns in any integration project.
BigR.io is equipped to handle these integration challenges, with broad engineering expertise in the following areas:
- Message Queues – buffer the asynchronicity between sending and receiving parties. Queues are designed to withstand intermittent loss of connectivity and guarantee durability. Advanced systems may also include features such as multiple consumers, exactly-once or at-least-once delivery, and per-client position indexing.
- Enterprise Service Buses (ESBs) – provide agility and flexibility between communicating parties. ESBs can, for example, implement routing rules for multiple receivers based on topic. Their programmability allows for fine-grained schema translation on the fly. There is an entire class of integration patterns (aggregator, channel adapter, service activator, etc.) that ESBs can implement.
- XML and JSON Messaging – many messaging standards exist to serve common exchange requirements. For example, MTOM (Message Transmission Optimization Mechanism) provides efficient transmission of binary data in XML. Sophisticated libraries now exist to accommodate evolving JSON schemas.
- Hadoop Database Unification – open source solutions are available to query multiple data sources in a single query. These tools are designed with query optimization and predicate pushdown to achieve response times close to those of single-database queries.
- Real-Time Schema Translation – in-memory computing technologies, such as Spark, process transformation logic on the fly. For example, a JDBC interface can be provided to add a relational interface to a NoSQL database. The general-purpose, high-performance nature of such a tool makes it a prime candidate for custom integration projects.
- Open Source Solutions – Hadoop, Spark, and other Big Data innovations that fully leverage parallelism, advanced error handling, retry logic, and logging support are well suited to integration work.
- System-Wide Error Handling – system-wide monitoring is necessary because no single component provides all the clues as to where an error occurred in a long data pipeline. A combination of logging and network monitoring tools can centralize error events in one dashboard to facilitate immediate diagnosis.
- Cloud to On-Premise Bridges – a new breed of ETL tools with cloud-specific features meets the need to move messages and data across the cloud-enterprise boundary. A combination of local agent caching and network encryption extends the data capacity of a client company and opens up an entirely new domain of computing resources.
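
To make the message queue item above more concrete, the following is a minimal in-memory sketch of per-client position indexing combined with at-least-once delivery. It is an illustration only, not a description of any particular product: the class `DurableLog`, its methods, and the consumer names are hypothetical, and a real queue would persist the log and the committed positions to disk.

```python
from collections import defaultdict


class DurableLog:
    """Toy append-only queue with per-consumer position indexing (hypothetical)."""

    def __init__(self):
        self._messages = []                 # append-only message log
        self._committed = defaultdict(int)  # consumer id -> next offset to read

    def publish(self, message):
        """Append a message and return its offset in the log."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, consumer_id, max_messages=10):
        """Return (offset, message) pairs starting at the consumer's committed position."""
        start = self._committed[consumer_id]
        end = min(start + max_messages, len(self._messages))
        return [(i, self._messages[i]) for i in range(start, end)]

    def commit(self, consumer_id, offset):
        """Acknowledge every message up to and including `offset`."""
        self._committed[consumer_id] = max(self._committed[consumer_id], offset + 1)


def demo():
    log = DurableLog()
    for text in ("order-1", "order-2", "order-3"):
        log.publish(text)

    # "billing" reads two messages but acknowledges only the first one.
    batch = log.poll("billing", max_messages=2)
    log.commit("billing", batch[0][0])

    # After a simulated crash, the unacknowledged message is delivered again.
    redelivered = log.poll("billing")

    # "audit" keeps its own position and starts from the beginning of the log.
    fresh = log.poll("audit")
    return batch, redelivered, fresh
```

Because a consumer's position advances only on `commit`, a receiver that fails mid-batch sees the unacknowledged messages again, which is the at-least-once guarantee. Exactly-once behavior would additionally require the receiver to deduplicate or to process messages idempotently.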