Tools of the Trade

Data Pipeline

Talend

Talend is an open-source software vendor that provides solutions for big data integration, master data management, and enterprise application integration. Talend describes itself as the first integration platform built on Spark and claims up to 100x better performance than other platforms on the market.

Logstash

Logstash is an open-source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite “stash.”

Kafka

Apache Kafka® is a distributed streaming platform and is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.

Data Storage

Pivotal Greenplum

Pivotal Greenplum is the world’s first fully-featured, multi-cloud, massively parallel processing (MPP) data platform based on the open source Greenplum Database. Pivotal Greenplum provides comprehensive and integrated analytics on multi-structured data. Powered by one of the world’s most advanced cost-based query optimizers, Pivotal Greenplum delivers unmatched analytical query performance on massive volumes of data.

Hadoop

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, so it delivers a highly available service on top of a cluster of computers, each of which may be prone to failure.

PostgreSQL

PostgreSQL is an object-relational database management system. With over 15 years of constant development, it is known for its reliability and power, and it bills itself as the world's most advanced open-source database. Its primary functions are to store data securely, follow best practices, and allow easy retrieval by other software applications.

Microsoft SQL Server

SQL Server is the foundation of Microsoft’s data platform. It delivers high performance for mission-critical applications using in-memory technologies, faster insights from data for users of familiar tools like Excel, and a resilient platform for building, deploying, and managing solutions that span on-premises and cloud.

Elasticsearch

Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.

Data Analysis

The R Project

R is an open-source programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians and data miners for developing statistical software and for data analysis. Polls and surveys of data miners show that R's popularity has increased substantially in recent years.

Python

Python is an interpreted high-level programming language for general-purpose programming. It lets you work more quickly and integrate your systems more effectively.

Hive

The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage.

Spark

Apache Spark™ is a unified analytics engine for large-scale data processing. Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine.

Visualization/Front-end

Kibana

Kibana is an open source analytics and visualization platform designed to work with Elasticsearch. You use Kibana to search, view, and interact with data stored in Elasticsearch indices. Kibana makes it easy to understand large volumes of data. Its simple, browser-based interface enables you to quickly create and share dynamic dashboards that display changes to Elasticsearch queries in real time.

Tableau

Tableau Software helps people see and understand data. Tableau helps anyone quickly analyze, visualize, and share information. More than 21,000 customer accounts get rapid results with Tableau in the office and on the go, and tens of thousands of people use Tableau Public to share data in their blogs and websites.

Jaspersoft

Jaspersoft allows users to make data-driven decisions inside the apps and business programs they already use. It focuses on individual needs and provides an easy-to-use platform that scales economically and architecturally to reach a larger audience.

Copyright 2018 QBIX Analytics | 303 Twin Dolphin Drive, Redwood City, CA 94065