Big Data · Data Engineering · programming · Uncategorized

Spark cheatsheet

Mount S3 bucket

    def mountBucket(accesskey, secretkey, bucketName, mountFolder):
        ACCESS_KEY_ID = accesskey
        SECRET_ACCESS_KEY = secretkey
        print("Mounting", bucketName)
        try:
            # Unmount the data in case it was already mounted.
            dbutils.fs.unmount(mountFolder)
        except:
            # If it fails to unmount, it most likely wasn't mounted in the first place.
            print("Directory not unmounted: ", mountFolder)
        finally:
            # Lastly,…

Continue reading Spark cheatsheet

Big Data · data science · machine learning · programming

Apache Hadoop (projects)

QUESTIONS: setInputFormat; comparator; top k frequent words. HADOOP SYSTEM: Apache Hadoop is an open-source software framework for storage and large-scale processing of data sets on clusters of commodity hardware. HDFS (Hadoop Distributed File System): data storage (data splitting and data replication). MapReduce (data processing): how to leverage job; how do nodes communicate; how to deal with node… Continue reading Apache Hadoop (projects)
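The "top k frequent words" question listed above can be sketched in a few lines. This is a minimal single-process illustration of the map and reduce steps in Python, not Hadoop code and not the post's own solution; the names map_phase, reduce_phase, and top_k_words are hypothetical, and the sort key stands in for the comparator a Hadoop job would use.

```python
import heapq
from collections import Counter
from functools import reduce


def map_phase(chunk):
    """Map step: count the words in one chunk of text."""
    return Counter(chunk.lower().split())


def reduce_phase(left, right):
    """Reduce step: merge two partial word counts."""
    return left + right


def top_k_words(chunks, k):
    """Return the k most frequent words as (word, count) pairs."""
    totals = reduce(reduce_phase, map(map_phase, chunks), Counter())
    # The key acts like a comparator: higher count first, ties in alphabetical order.
    return heapq.nsmallest(k, totals.items(), key=lambda item: (-item[1], item[0]))


chunks = ["the quick brown fox", "the lazy dog", "the quick dog"]
print(top_k_words(chunks, 2))  # [('the', 3), ('dog', 2)]
```

On a real cluster each chunk would be an input split handled by a separate mapper, and the merge would happen in the reducers; the sketch only mirrors that data flow.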


JAVA Basics

Questions about Java: static (shared by all objects, owned by the class), combined with final (unchanged); serialization, Serializable: a common language for communication and persistence; StringBuilder; regular expressions: regex, \\s, \\s+; Integer/int, Character/char?; singleton; Iterator, Iterable; static block; this() in constructor; GNU Trove; script language. 32-BIT SYSTEM and 64-BIT SYSTEM: 2^32 − 1 = 4294967295 = 4 GiB − 1… Continue reading JAVA Basics
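The 32-bit figure quoted in the excerpt can be checked with a few lines of arithmetic. This is a small Python sketch added for illustration (the variable names are hypothetical); the signed limit is included because Java's int is a signed 32-bit type.

```python
GIB = 2 ** 30  # one gibibyte, in bytes

max_u32 = 2 ** 32 - 1      # largest unsigned 32-bit value
max_i32 = 2 ** 31 - 1      # largest signed 32-bit value (Java's Integer.MAX_VALUE)
addressable_32 = 2 ** 32   # number of distinct byte addresses with 32-bit pointers

print(max_u32)                # 4294967295
print(addressable_32 // GIB)  # 4
print(max_i32)                # 2147483647
```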


LintCode Diary

JAVA PROBLEM: global variables and passing constants between functions (the difference between 376 and 381); String.valueOf(node.val), appending a number to a string; comparator (compare, equals); Collections.sort (ascending and descending); what does it mean when a class is declared with only class and no public?; memorize the subsets template. OTHER PROBLEM: graph coloring. ATTENTION! Think about exceptions before starting: capitalized characters, repeated elements in an array/vector. Check special cases at the beginning: null strings/arrays, empty strings/arrays (one of them is empty or both of them are empty, LintCode13), two variables are… Continue reading LintCode Diary
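The excerpt mentions a subsets template along with two of its cautions: check null or empty input first, and watch for repeated elements. The following is one common backtracking form, written in Python as an assumption about what such a template looks like; it is not the author's Java version.

```python
def subsets(nums):
    """Return all subsets of nums, without duplicate subsets."""
    if not nums:          # special case: None or empty input
        return [[]]
    nums = sorted(nums)   # sorting groups repeated elements together
    result = []

    def backtrack(start, path):
        result.append(list(path))  # record a copy of the current subset
        for i in range(start, len(nums)):
            if i > start and nums[i] == nums[i - 1]:
                continue           # skip a repeated value at the same depth
            path.append(nums[i])
            backtrack(i + 1, path)
            path.pop()             # undo the choice before trying the next one

    backtrack(0, [])
    return result


print(subsets([1, 2, 2]))  # [[], [1], [1, 2], [1, 2, 2], [2], [2, 2]]
```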

C++ · programming

C++ programming basics

Class: Passing classes to functions: when a class instance is passed by reference, changes are reflected in the original. Recall that assigning one class instance to another copies all fields. Constructors: have the same name as the class, can accept parameters, and have no return value. There can be multiple constructors. If a constructor with parameters is defined, the default… Continue reading C++ programming basics