gem install mysql2 -- --srcdir=/usr/local/mysql/include
Monday, February 21, 2022
fatal error: Unable to locate credentials
Command:
$ aws s3 rm s3://YOUR_BUCKET/ --recursive --dryrun --exclude "*" --include "my-folder/*"
Fix:
Run aws configure and provide the access key and secret key when prompted.
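For example (the key, secret, and region values below are placeholders):
$ aws configure
AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Default region name [None]: us-east-1
Default output format [None]: json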
Unsupported Python version detected: Python 2.7
Error:
Unsupported Python version detected: Python 2.7
To continue using this installer you must use Python 3.6 or later.
For more information see the following blog post: https://aws.amazon.com/blogs/developer/announcing-end-of-support-for-python-2-7-in-aws-sdk-for-python-and-aws-cli-v1/
Change this command:
sudo ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws
to
sudo /usr/bin/python3 ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws
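Afterwards, confirm the CLI now runs under Python 3 (the version string below is illustrative):
$ aws --version
aws-cli/1.22.56 Python/3.8.10 Linux/5.13.0-28-generic botocore/1.24.1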
Thursday, February 10, 2022
pyspark Installation
Install Latest Python
sudo apt update
sudo apt -y upgrade
python3 -V
Install Development Tools
sudo apt install -y build-essential libssl-dev libffi-dev python3-dev
Install Python 3.10 from the deadsnakes PPA
sudo apt install -y software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install -y python3.10
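Confirm the new interpreter is available (the exact patch version will vary):
$ python3.10 --version
Python 3.10.2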
Install PIP
sudo apt install -y python3-pip
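Check which Python pip is bound to (the path and versions here are illustrative):
$ pip3 --version
pip 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.8)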
Install pyspark
pip install pyspark
pip typically does a user install here, placing the pyspark launcher under ~/.local/bin, so reload your profile to pick up the updated PATH:
source ~/.profile
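If pyspark still isn't found, ~/.local/bin may be missing from your PATH; Ubuntu's stock ~/.profile only adds it when the directory exists, so add a line like this and source the file again:
export PATH="$HOME/.local/bin:$PATH"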
Install Java, Scala, and Git
Spark itself runs on the JVM, so a JDK is required:
sudo apt install -y default-jdk scala git
Verify Spark Installation
pyspark
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0] on linux
Welcome to Spark version 3.2.1
Using Python version 3.8.10 (default, Nov 26 2021 20:14:08)
Spark context Web UI available at http://dollar.lan:4040
Spark context available as 'sc' (master = local[*], app id = local-1644524108365).
SparkSession available as 'spark'.
>>> big_list = range(1000)
>>> rdd = sc.parallelize(big_list, 2)
>>> odds = rdd.filter(lambda x: x % 2 != 0)
>>> odds.take(5)
[1, 3, 5, 7, 9]
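The SparkSession can be exercised the same way; here is a quick DataFrame check (the column names and sample rows are arbitrary):
>>> df = spark.createDataFrame([(1, "odd"), (2, "even")], ["n", "parity"])
>>> df.filter(df.n % 2 != 0).show()
+---+------+
|  n|parity|
+---+------+
|  1|   odd|
+---+------+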