LucaCanali

Organizations

@cerndb

Pinned

  1. sparkMeasure (Public)

    This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination…

    Scala · 693 stars · 144 forks

  2. Miscellaneous (Public)

    Includes notes on using Apache Spark in general, notes on using Spark for Physics, how to run TPCDS on PySpark, how to create histograms with Spark, tools for performance testing CPUs, Jupyter note…

    Jupyter Notebook · 420 stars · 146 forks

  3. cerndb/SparkPlugins (Public)

    Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are initialized. This also allows extending the Spark metrics syst…

    Scala · 82 stars · 14 forks

  4. cerndb/spark-dashboard (Public)

    Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an Apache Spark Performance Dashboard using container technology.

    Dockerfile · 111 stars · 22 forks

  5. cerndb/SparkTraining (Public)

    Material for the course "Introduction to Apache Spark APIs for Data Processing": https://sparktraining.web.cern.ch/

    Jupyter Notebook · 11 stars · 5 forks

  6. cerndb/SparkDLTrigger (Public)

    Code and links to the data for the article "Machine Learning Pipelines with Modern Big Data Tools for High Energy Physics"

    Jupyter Notebook · 29 stars · 13 forks