Google Datalab Exploration, Analysis, Data Visualization, And Machine Learning
   


An interactive tool that is easy to use for exploration, analysis, data visualization, and machine learning.

Google today launched Cloud Datalab. Cloud Datalab is an interactive and reliable tool created to explore, analyze, transform, and visualize data and create machine learning models on the Google Cloud Platform. This tool can run on Compute Engine and easily connect to many cloud services so you can focus on your data science tasks.


This service uses Jupyter notebooks (formerly known as IPython), a format that lets you create documents that combine live code and visualizations. Jupyter is widely used in the data science world, and a growing ecosystem has developed around it, which should make this new Google tool easier to adopt.
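
To make the notebook format more concrete, here is a minimal sketch of what a notebook document contains, assuming the standard Jupyter nbformat version 4 JSON layout. The cell contents are invented for illustration and are not taken from the notebooks that ship with Datalab.

```python
import json

# A Jupyter notebook is a JSON document: a list of cells plus metadata.
# The layout follows nbformat version 4; the cell contents are made up.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {
            # Markdown cells hold the documentation.
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Exploring sample data\n", "Notes live next to the code."],
        },
        {
            # Code cells hold the source and, once run, its outputs.
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["total = sum([1, 2, 3])\n", "print(total)"],
        },
    ],
}


def cell_types(nb):
    """Return the type of every cell, in document order."""
    return [cell["cell_type"] for cell in nb["cells"]]


def to_json(nb):
    """Serialize the notebook the way it would be stored in an .ipynb file."""
    return json.dumps(nb, indent=1)


print(cell_types(notebook))
```

Saving the string returned by to_json to a file with the .ipynb extension should give a document that Jupyter-based tools can open, with the documentation and the code kept side by side.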

To get started, you must first deploy Cloud Datalab as an App Engine application, and that is where charges for the service will apply once the free beta period ends (Google has not released pricing information). Once deployment completes, you can start a new project and create a new notebook. The service comes with several notebooks preinstalled to help you get started.

What’s cool here is that Datalab is open source, and developers who want to extend it can simply fork it and/or send pull requests on GitHub.

1. Integrated & open source

Cloud Datalab was built on Jupyter (formerly IPython), which has a thriving module ecosystem and a strong knowledge base. Cloud Datalab allows analysis of your data in BigQuery, Cloud Machine Learning Engine, Compute Engine, and Cloud Storage using Python, SQL, and JavaScript (for user-defined BigQuery functions).

2. Scalable

Whether you are analyzing megabytes or terabytes of data, Cloud Datalab can help you. Query terabytes of data in BigQuery, run a local analysis on sampled data, and run training jobs on terabytes of data in Cloud Machine Learning Engine seamlessly.

3. Data management and visualization

Use Cloud Datalab to gain insight from your data. Explore, transform, analyze, and visualize your data interactively using BigQuery, Cloud Storage, and Python.

4. Machine learning with lifecycle support

Go from raw data to deployed machine learning (ML) models that are ready for prediction. Explore data, then build, evaluate, and optimize machine learning models using TensorFlow or Cloud Machine Learning Engine.

8 Google Datalab features that you should know about

1. Integrated

Cloud Datalab simplifies data processing with BigQuery, Cloud Machine Learning Engine, Cloud Storage, and Stackdriver Monitoring. Authentication, cloud computation, and source control are taken care of out of the box.

2. Multilingual support

Cloud Datalab currently supports Python, SQL, and JavaScript (for user-defined BigQuery functions).

3. Notebook format

Cloud Datalab combines code, documentation, results, and visualizations in a single, intuitive notebook format.

4. Pay-per-use pricing

Pay only for the cloud resources you use: Compute Engine VMs, BigQuery, and any additional resources such as Cloud Storage.

5. Interactive data visualization

Use Google Charting or matplotlib for easy visualizations.

6. Machine learning

Supports TensorFlow-based deep ML models in addition to scikit-learn. Scale training and prediction through a specialized library for Cloud Machine Learning Engine.

7. IPython support

Datalab is based on Jupyter (formerly called IPython), so you can use many existing packages for statistics, machine learning, and more. Learn from published notebooks and swap tips with the active IPython community.

8. Open Source

Developers who want to extend Datalab can fork it and/or submit pull requests to the project hosted on GitHub.
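
Feature 2 in the list above mentions JavaScript for user-defined BigQuery functions. As a rough illustration of how the three languages can meet in one notebook cell, the sketch below uses Python to assemble a SQL query that embeds a JavaScript function body. It assumes BigQuery standard SQL syntax for temporary functions. The function name multiply_inputs and the sample rows are hypothetical, and the step that actually sends the query to BigQuery is deliberately left out.

```python
# Hypothetical JavaScript body for the user-defined function.
JS_BODY = "return x * y;"


def build_udf_query(js_body):
    """Return BigQuery standard SQL that defines and calls a JavaScript UDF.

    The function name and the inline sample rows are illustrative only.
    """
    lines = [
        "CREATE TEMP FUNCTION multiply_inputs(x FLOAT64, y FLOAT64)",
        "RETURNS FLOAT64",
        'LANGUAGE js AS r"""',
        "  " + js_body,
        '""";',
        "",
        "WITH numbers AS (",
        "  SELECT 1 AS x, 5 AS y UNION ALL",
        "  SELECT 2 AS x, 10 AS y",
        ")",
        "SELECT x, y, multiply_inputs(x, y) AS product",
        "FROM numbers;",
    ]
    return "\n".join(lines)


query = build_udf_query(JS_BODY)
print(query)
```

The printed text is plain SQL, so it could be pasted into a notebook's SQL cell or passed to whichever BigQuery client is available in your environment.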

Cloud Datalab Pricing

There is no charge for using Google Cloud Datalab itself. However, you pay for any Google Cloud Platform resources you use with Cloud Datalab, for example:

1. Compute resources: You are charged from the time the Cloud Datalab virtual machine (VM) is created until it is deleted. The default Cloud Datalab VM machine type is n1-standard-1, but you can choose a different machine type. You are also charged for the 20GB Standard Persistent Disk used as the boot disk and the 200GB Standard Persistent Disk where user notebooks are stored (see Storage resources). The 20GB boot disk is deleted when the VM instance is deleted, but the 200GB disk remains after the VM is deleted until you delete it yourself. The command datalab delete --delete-disk instance-name removes the VM instance, the 20GB boot disk, and the 200GB user notebook disk, where instance-name stands for the name of your instance.

2. Storage resources: Notebooks are saved to Persistent Disk and backed up to Google Cloud Storage (see Persistent Disk pricing and Cloud Storage pricing).

3. Data analysis services: You incur Google BigQuery charges when you issue SQL queries from a Cloud Datalab notebook (see BigQuery pricing). Additionally, when you use Google Cloud Machine Learning Engine, you may be charged for Cloud Machine Learning Engine and/or Google Cloud Dataflow.

4. Other resources: You may be charged for other API requests that you make in the Cloud Datalab notebook environment. This fee will vary according to the API.
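
The charges listed above can be added up in a back-of-the-envelope estimate. The sketch below shows only the arithmetic. Every rate in it is a placeholder chosen for illustration, not a real Google Cloud price, and it simplifies by assuming the VM runs all month and the disks are billed for a full month. Look up current rates before relying on any number it produces.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # 8760 hours per year divided by 12 months


@dataclass
class Rates:
    """Unit prices in USD. All values passed in below are placeholders."""
    vm_per_hour: float        # VM price per hour
    disk_per_gb_month: float  # Standard Persistent Disk price per GB per month
    query_per_tb: float       # BigQuery price per TB scanned


def estimate_monthly_cost(rates, vm_hours, disk_gb, tb_scanned):
    """Sum compute, storage, and query charges for one month."""
    compute = rates.vm_per_hour * vm_hours
    storage = rates.disk_per_gb_month * disk_gb
    queries = rates.query_per_tb * tb_scanned
    return {
        "compute": compute,
        "storage": storage,
        "queries": queries,
        "total": compute + storage + queries,
    }


# Placeholder rates, NOT real prices.
example_rates = Rates(vm_per_hour=0.05, disk_per_gb_month=0.04, query_per_tb=5.0)

estimate = estimate_monthly_cost(
    example_rates,
    vm_hours=HOURS_PER_MONTH,  # VM left running all month
    disk_gb=20 + 200,          # boot disk plus notebook disk
    tb_scanned=2,              # hypothetical query volume
)

for item, cost in estimate.items():
    print(f"{item}: ${cost:.2f}")
```

The estimate leaves out Cloud Storage backups and other API calls, which also appear in the list above and would be added in the same way.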

To estimate your own costs, you can use the Google Cloud Pricing Calculator.
