👨‍💻
Solving Data Engineering problems...

The Python library for names.

Python 892 155 Updated Apr 9, 2025

List of Computer Science courses with video lectures.

68,480 9,261 Updated Apr 12, 2025

My notes for AWS Solutions Architect Associate.

1,665 479 Updated Jul 26, 2023

This is a repo documenting the best practices in PySpark.

Jupyter Notebook 462 77 Updated Dec 8, 2022

A collection of inspiring lists, manuals, cheatsheets, blogs, hacks, one-liners, cli/web tools and more.

163,133 10,247 Updated Nov 19, 2024

An evolving how-to guide for securing a Linux server.

18,002 1,149 Updated Oct 19, 2024

Resumes generated using GitHub information

JavaScript 62,364 1,357 Updated Feb 15, 2023

😎 Awesome lists about all kinds of interesting topics

355,964 28,909 Updated Apr 8, 2025

A curated list of awesome big data frameworks, resources and other awesomeness.

13,555 2,568 Updated Feb 14, 2025

A convenient Python wrapper for Apache NiFi

Python 256 75 Updated Apr 1, 2025

A curated list of data engineering tools for software developers

7,291 1,313 Updated Apr 7, 2025

Retrying library for Python

Python 7,283 291 Updated Apr 2, 2025

Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.

Java 88 15 Updated Mar 5, 2024

ReAir is a collection of easy-to-use tools for replicating tables and partitions between Hive data warehouses.

Java 282 97 Updated Feb 27, 2019

Data cleansing tutorial for chipy scientific SIG

Jupyter Notebook 8 8 Updated Feb 18, 2016

📚 Parameterize, execute, and analyze notebooks

Python 6,129 437 Updated Apr 7, 2025

📘 The interactive computing suite for you! ✨

TypeScript 6,242 552 Updated Dec 30, 2023

SparkOnHBase

Scala 279 177 Updated Mar 30, 2021

A Python WebHDFS-based tool for inter/intra-cluster data copying.

Python 9 4 Updated Aug 27, 2020

The Python micro framework for building web applications.

Python 69,303 16,361 Updated Mar 30, 2025

the only cheat sheet you need

Python 39,243 1,810 Updated Feb 1, 2025

50+ DockerHub public images for Docker & Kubernetes - DevOps, CI/CD, GitHub Actions, CircleCI, Jenkins, TeamCity, Alpine, CentOS, Debian, Fedora, Ubuntu, Hadoop, Kafka, ZooKeeper, HBase, Cassandra,…

Shell 1,338 472 Updated Mar 14, 2025

Learn how to use Spark SQL and HSpark connector package to create / query data tables that reside in HBase region servers

69 27 Updated Oct 18, 2022

Chaos Monkey is a resiliency tool that helps applications tolerate random instance failures.

Go 15,674 1,194 Updated Jan 6, 2025

📖 A collection of pure bash alternatives to external processes.

Shell 37,013 3,328 Updated Nov 28, 2023

Interview questions about Python

Shell 16,801 5,555 Updated Mar 5, 2025

A list of helpful Scala related questions you can use to interview potential candidates.

504 86 Updated Mar 21, 2017

A curated list of awesome Apache Spark packages and resources.

Shell 1,782 336 Updated Oct 24, 2024

Examples for High Performance Spark

Scala 508 234 Updated Nov 3, 2024