2.8 Exploring Data

There are many ways to explore a dataset in Momentum. This section walks through a few of them to illustrate the functionality. Launch the exploration page by clicking “Data Upload and Exploration” at the top of the left-hand menu panel. The data exploration steps are described below.

Exploring Data, Types and Distribution 

To view the data types and column-wise distribution, follow these steps:

  1. Expand the data source, e.g., “Ingester Output”, and select the ingester data you want to explore.
  1. Click “Explore Data” in the top menu bar.
  1. The next page displays the column-wise data distribution.
  1. The result shows the data type, total count, number of nulls, min, max, average, and standard deviation of each column.

A sample data exploration result is shown in Figure 2.12 below. Data created by other components, such as the transformer, machine learning, or NLP components, can be explored in the same way.

Figure 2.12: Data exploration result
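The per-column fields listed above can be reproduced offline to sanity-check an exploration result. The following is a minimal Python sketch, not Momentum's implementation. The rows are made-up sample values (only the column names VIBRATION and NC_MODE come from this page), and two choices are assumptions: the total count includes nulls, and the standard deviation is the sample standard deviation.

```python
import statistics


def summarize_column(values):
    """Return the summary fields for one column.

    `values` is a list in which None stands for a null cell.
    """
    non_null = [v for v in values if v is not None]
    summary = {
        "type": type(non_null[0]).__name__ if non_null else "unknown",
        "count": len(values),  # assumption: total count includes nulls
        "nulls": len(values) - len(non_null),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "average": None,
        "stddev": None,
    }
    is_numeric = bool(non_null) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    )
    if is_numeric:
        summary["average"] = statistics.fmean(non_null)
        # The sample standard deviation needs at least two values.
        if len(non_null) > 1:
            summary["stddev"] = statistics.stdev(non_null)
    return summary


# Hypothetical rows; real data would come from a Momentum component.
rows = [
    {"VIBRATION": 0.42, "NC_MODE": "AUTO"},
    {"VIBRATION": None, "NC_MODE": "MANUAL"},
    {"VIBRATION": 0.58, "NC_MODE": "AUTO"},
]

# Pivot the rows into columns, then summarize each column.
columns = {name: [row[name] for row in rows] for name in rows[0]}
report = {name: summarize_column(values) for name, values in columns.items()}

for name, summary in report.items():
    print(name, summary)
```

For text columns such as NC_MODE, the sketch leaves the average and standard deviation empty and reports min and max in alphabetical order.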

Viewing and Analyzing Data 

Expand an output component, e.g., “Ingester Output”, and click the data component (e.g., an ingester) you want to explore. This shows 100 records of the data. To show more rows, edit the SQL query shown in the text area and click the blue button next to it to run the updated SQL. For example, changing the clause to LIMIT 200 shows 200 rows, and LIMIT all shows all the data (note that LIMIT all may crash your browser if the dataset is large). Figure 2.13 shows the SQL and the data rows.

Figure 2.13: Data view and corresponding SQL 
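The effect of the LIMIT clause can be tried locally before running a larger query in the browser. The sketch below uses Python's built-in sqlite3 module as a stand-in for Momentum's query engine. The table name `ingester_sample` and its contents are hypothetical. Because of the browser warning above, it also counts the rows first, which is a cheap way to judge whether removing the limit is safe. SQLite's syntax for an unbounded limit differs, so `LIMIT all` itself is not shown.

```python
import sqlite3

# In-memory database standing in for an ingester output (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ingester_sample (id INTEGER, vibration REAL)")
conn.executemany(
    "INSERT INTO ingester_sample VALUES (?, ?)",
    [(i, i * 0.01) for i in range(500)],
)


def preview(limit):
    """Fetch at most `limit` rows, mirroring an edited LIMIT clause."""
    return conn.execute(
        "SELECT * FROM ingester_sample LIMIT ?", (limit,)
    ).fetchall()


# Check the table size before asking for everything.
total_rows = conn.execute("SELECT COUNT(*) FROM ingester_sample").fetchone()[0]

default_rows = preview(100)  # the default view
more_rows = preview(200)     # after editing the clause to LIMIT 200

print(total_rows, len(default_rows), len(more_rows))  # prints: 500 100 200
```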

Alternatively, you can use Interactive Query to perform ad hoc analysis as described below. 

Ad hoc Analysis Using Interactive Query 

Interactive Query is a powerful data exploration tool that allows you to execute any ANSI-SQL compliant query over data available within Momentum.  

Data within Momentum is organized by the component that generates it. The organization is analogous to an RDBMS: the component name is treated as a database, and the data generated from various sources are treated as tables of that database. For example, Ingester-generated data is organized within “Ingester Output”, aliased as “io”. Tables within the Ingester Output are referenced by their fully qualified name, “io.<username>.<tablename>”.

For example, to explore the machine data by computing the average VIBRATION for each NC_MODE, we run the following Interactive Query, as shown in Figure 2.14 below.

SELECT AVG(VIBRATION), NC_MODE FROM io.ai.cnc_historical_data GROUP BY NC_MODE

Listing 1: Sample SQL statement to average VIBRATION by NC_MODE

Figure 2.14: Example Interactive Query with sample output 
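A grouped aggregate of this kind can be prototyped locally before running it in Interactive Query. The sketch below again uses sqlite3 as a stand-in. SQLite does not use Momentum's three-part `io.<username>.<tablename>` naming, so the table is simply called `cnc_historical_data`, and the rows are invented for illustration. It also adds a `COUNT(*)` column to show how a per-group record count would sit alongside the average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cnc_historical_data (NC_MODE TEXT, VIBRATION REAL)")
conn.executemany(
    "INSERT INTO cnc_historical_data VALUES (?, ?)",
    [
        # Hypothetical readings, not real machine data.
        ("AUTO", 0.40),
        ("AUTO", 0.60),
        ("MANUAL", 1.00),
        ("MANUAL", 2.00),
        ("MANUAL", 3.00),
    ],
)

# One output row per NC_MODE: the average vibration and the record count.
query = """
SELECT NC_MODE, AVG(VIBRATION) AS avg_vibration, COUNT(*) AS n_records
FROM cnc_historical_data
GROUP BY NC_MODE
ORDER BY NC_MODE
"""
results = conn.execute(query).fetchall()

for mode, avg_vibration, n_records in results:
    print(mode, round(avg_vibration, 3), n_records)
```

In Momentum, the same SELECT would reference the fully qualified table name instead, e.g. `io.ai.cnc_historical_data` as in Listing 1.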

Visual Analysis 

Visual analysis lets us plot data to understand its distribution, outliers, trends, and overall quality. To perform visual analysis, click “Data Upload & Exploration” and do the following:

  1. Expand, for example, “Ingester Output” and click the ingester you want to analyze.
  1. This shows 100 rows of data. Note the graph icon at the top of the query result section (as shown in Figure 2.15 below).
  1. Click the graph icon to launch a modal window in which you configure your graph.

Figure 2.15: The red circle indicates the graph icon that launches the plot configuration window

Figure 2.16: Configuration example for plotting a histogram

Figure 2.17: An example output of histogram plots 
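To make concrete what a histogram plot summarizes, the following Python sketch counts values into equal-width buckets and prints a text bar for each. It is an illustration only. Whether Momentum's plot uses equal-width bins is an assumption, and the readings are hypothetical.

```python
def histogram(values, bins):
    """Count values in `bins` equal-width buckets spanning min..max."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value lands in the last bucket.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Hypothetical vibration readings.
vibration = [0.1, 0.2, 0.2, 0.3, 0.5, 0.9, 1.0]
edges, counts = histogram(vibration, bins=3)

for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:.2f} to {right:.2f}: {'#' * count}")
```

A tall first bar with a thin tail, as in this sample, is the kind of skew or outlier pattern the histogram plot is meant to surface.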

Downloading Data for Offline Exploration 

  1. Expand, for example, “Ingester Output” or any other component that generated data, and select the data you wish to download.
  1. Click “Download Data” in the top menu bar.
  1. The data is downloaded in the format in which it was originally created; the default is the Parquet format.

Note that, depending on the amount of data, it may take a while to generate the file and download it from the cluster’s distributed lake to your local computer.
