50 Hours of Big Data, PySpark, AWS, Scala, and Scraping [Video]

50 Hours of Big Data, PySpark, AWS, Scala, and Scraping: Big Data with Scala and Spark, PySpark and AWS, Data Scraping and Data Mining with Python, Mastering MongoDB for Beginners

By AI Sciences
€37.99
Video Mar 2022 54 hours 32 minutes 1st Edition

What do you get with a video?

  • Download this video in MP4 format
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want

Key benefits

  • Data scraping and data mining with Python, from beginner to pro
  • Concepts explained clearly with examples in Python, Scrapy, Scala, PySpark, and MongoDB
  • Master Big Data with PySpark and AWS

Description

Part 1 is designed to reflect the most in-demand Scala skills. It provides an in-depth understanding of core Scala concepts. We will wrap up with a discussion of MapReduce and ETL pipelines using Spark from AWS S3 to AWS RDS (includes six mini-projects and one Scala Spark project).

Part 2 covers PySpark for data analysis. You will explore Spark RDDs, DataFrames, and some Spark SQL queries; the transformations and actions that can be performed on data using RDDs and DataFrames; and the Spark and Hadoop ecosystem and its underlying architecture. You will also learn how to leverage AWS storage, databases, and compute services, and how Spark can communicate with different AWS services.

Part 3 is all about data scraping and data mining. You will cover important concepts such as how an Internet browser executes and communicates with the server, synchronous and asynchronous requests, parsing the data in the server's response, tools for data scraping, the Python requests module, and more.

In Part 4, you will use MongoDB to develop an understanding of NoSQL databases. You will explore the basic operations and the MongoDB query, projection, and update operators. We will wind up this section with two projects: developing a CRUD-based application using Django and MongoDB, and implementing an ETL pipeline using PySpark to dump the data into MongoDB.

By the end of this course, you will be able to relate the concepts and practical aspects of the technologies you have learned to real-world problems.

All the resources of this course are available at https://github.com/PacktPublishing/50-Hours-of-Big-Data-PySpark-AWS-Scala-and-Scraping

What you will learn

  • Build an ETL pipeline from AWS S3 to AWS RDS using Spark
  • Explore Spark/Hadoop applications, ecosystem, and architecture
  • Learn collaborative filtering in PySpark
  • Recognize the distinction between synchronous and asynchronous requests
  • Understand MongoDB CRUD, query operators, projection operators, and update operators
  • Build APIs for CRUD operations in MongoDB through Django
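
As a rough illustration of the synchronous versus asynchronous distinction listed above, here is a minimal sketch that is not taken from the course materials. It simulates server latency with a sleep instead of making real network calls; the names `fetch_sync`, `fetch_async`, `run_sync`, `run_async`, and the `DELAY` value are all hypothetical.

```python
import asyncio
import time

DELAY = 0.05  # simulated server latency in seconds (hypothetical value)


def fetch_sync(name):
    """Blocking request: the caller waits until the reply arrives."""
    time.sleep(DELAY)
    return f"response:{name}"


async def fetch_async(name):
    """Non-blocking request: control returns to the event loop while waiting."""
    await asyncio.sleep(DELAY)
    return f"response:{name}"


def run_sync(names):
    # Requests run one after another, so total time is roughly len(names) * DELAY.
    return [fetch_sync(n) for n in names]


async def _gather(names):
    # gather() schedules all requests at once and keeps results in input order.
    return await asyncio.gather(*(fetch_async(n) for n in names))


def run_async(names):
    # Requests overlap, so total time is close to a single DELAY.
    return asyncio.run(_gather(names))


if __name__ == "__main__":
    pages = ["a", "b", "c"]
    print(run_sync(pages))
    print(run_async(pages))
```

Both versions return the same responses in the same order; the difference is only in whether the caller is blocked while each simulated request is in flight.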

Product Details

Publication date : Mar 30, 2022
Length : 54 hours 32 minutes
Edition : 1st Edition
Language : English
ISBN-13 : 9781803237039

Table of Contents

35 Chapters
1. Part 1 - Data Scraping and Data Mining for Beginners to Pro with Python
2. Requests
3. Beautiful Soup 4 (BS4)
4. CSS Selectors
5. Scrapy
6. Scrapy Project
7. Selenium
8. Project Selenium
9. Part 2 - Scala and Spark - Master Big Data with Scala and Spark
10. Scala Overview
11. Flow Control
12. Functions
13. Classes
14. Data Structures
15. Project for Scala and Spark
16. Part 3 - PySpark and AWS - Master Big Data with PySpark and AWS
17. Introduction to Hadoop, Spark Ecosystems and Architectures
18. Spark RDDs
19. Spark DFs
20. Collaborative Filtering
21. Spark Streaming
22. ETL Pipeline
23. Project - Change Data Capture / Replication On Going
24. Part 4 - MongoDB-Mastering MongoDB for Beginners (Theory and Projects)
25. Overview
26. Basic Mongo Operations
27. Basic Update Operation
28. Basic Read Operation
29. Basic Delete Operation
30. Query and projection operators
31. Update Operators
32. Mongo with Node
33. Mongo with Python
34. Django with Mongo
35. Spark with Mongo

FAQs

How can I download a video package for offline viewing?

  1. Log in to your account at Packtpub.com.
  2. Click on "My Account" and then click on the "My Videos" tab to access your videos.
  3. Click on the "Download Now" link to start your video download.

How can I extract my video file?

All modern operating systems ship with ZIP file extraction built in. If you'd prefer to use a dedicated compression application, we've tested WinRAR / 7-Zip for Windows, Zipeg / iZip / UnRarX for Mac and 7-Zip / PeaZip for Linux. These applications support all common archive file extensions.

How can I get help and support around my video package?

If your video course doesn't give you what you were expecting, either because of functionality problems or because the content isn't up to scratch, please mail customercare@packt.com with details of the problem. In addition, so that we can best provide the support you need, please include the following information for our support team.

  1. Video
  2. Format watched (HTML, MP4, streaming)
  3. Chapter or section that issue relates to (if relevant)
  4. System being played on
  5. Browser used (if relevant)
  6. Details of support

Why can’t I download my video package?

If you are having issues downloading your video package, please follow these instructions:

  1. Disable all your browser plugins and extensions: Some security and download manager extensions can cause issues during the download.
  2. Download the video course using a different browser: We've tested that downloads work correctly in current versions of Chrome, Firefox, Internet Explorer, and Safari.