Spark cheatsheet

Mount S3 bucket

    def mountBucket(accesskey, secretkey, bucketName, mountFolder):
        ACCESS_KEY_ID = accesskey
        SECRET_ACCESS_KEY = secretkey
        print("Mounting", bucketName)
        try:
            # Unmount the data in case it was already mounted.
            dbutils.fs.unmount(mountFolder)
        except Exception:
            # If the unmount fails, the folder most likely wasn't mounted in the first place.
            print("Directory not unmounted: ", mountFolder)
        finally:
            # Lastly,…
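The excerpt cuts off before the mount call itself, so what follows is not the author's code. It is a minimal sketch of how the unmount-then-mount pattern might be completed, under these assumptions: credentials are embedded in an `s3a://` source URI with the secret key percent-encoded, `dbutils` is passed in as an argument so the function can be exercised outside a Databricks notebook, and the names `build_s3_source` and `mount_bucket` are hypothetical.

```python
from urllib.parse import quote


def build_s3_source(access_key, secret_key, bucket_name):
    """Build an s3a:// URI with embedded credentials (hypothetical helper)."""
    # Secret keys may contain "/" or "+", so percent-encode every reserved character.
    encoded_secret = quote(secret_key, safe="")
    return "s3a://{}:{}@{}".format(access_key, encoded_secret, bucket_name)


def mount_bucket(dbutils, access_key, secret_key, bucket_name, mount_folder):
    """Remount bucket_name at mount_folder and return the source URI used."""
    print("Mounting", bucketName := bucket_name)
    try:
        # Unmount first in case the folder is already mounted.
        dbutils.fs.unmount(mount_folder)
    except Exception:
        # Assumption: a failed unmount just means nothing was mounted there.
        print("Directory not unmounted:", mount_folder)
    source = build_s3_source(access_key, secret_key, bucket_name)
    # dbutils.fs.mount takes the source URI followed by the mount point.
    dbutils.fs.mount(source, mount_folder)
    return source
```

In a notebook this might be called as `mount_bucket(dbutils, access_key, secret_key, "my-bucket", "/mnt/my-bucket")`, where the bucket name and mount path are placeholders. Keeping the URI construction in its own function makes the encoding step easy to check without a cluster.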