2017 May | SQL or NoSql

Archive for May, 2017

Data alignment in block devices

CentOS 7

[root@]# cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)

Problem: mkfs.xfs warning: device is not properly aligned

[root@]# parted -a optimal /dev/mapper/mpathb mkpart primary 0% 100%
[root@]# parted /dev/mapper/mpathb align-check opt 1
1 aligned
[root@]# parted /dev/mapper/mpathb p
Model: Linux device-mapper (multipath) (dm)
Disk /dev/mapper/mpathb: 21.5GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: […]
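To make the idea behind the alignment check in the session above more concrete, here is a minimal Python sketch of the arithmetic. It is a simplified model and not parted's actual implementation. The function name `is_aligned`, its parameters and the 1 MiB fallback are assumptions made for illustration; on a real system the values would typically be read from `/sys/block/<dev>/queue/optimal_io_size` and `/sys/block/<dev>/alignment_offset`.

```python
# Assumed fallback granularity (1 MiB) when the device reports no optimal I/O size.
DEFAULT_GRANULARITY = 1024 * 1024


def is_aligned(start_sector, logical_sector_size=512,
               optimal_io_size=0, alignment_offset=0):
    """Return True if the partition's first byte sits on an optimal I/O boundary.

    Hypothetical helper: a simplified model of an alignment check,
    not the logic parted itself uses.
    """
    # The kernel reports 0 when the device does not advertise an optimal size.
    granularity = optimal_io_size or DEFAULT_GRANULARITY
    start_byte = start_sector * logical_sector_size
    return (start_byte - alignment_offset) % granularity == 0


if __name__ == "__main__":
    # Sector 2048 with 512-byte sectors starts exactly at 1 MiB.
    print(is_aligned(2048))
    # Sector 34 (the first usable sector on a typical GPT disk) does not.
    print(is_aligned(34))
```

Under these assumptions, a partition created with `0%` as its start lands on such a boundary, which is why `align-check opt 1` reports it as aligned.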

Spark and CSV and SQL

SQL

data_frame = spark.read.csv("/db/nowe/APLUS_LOG_MINMAX_1183241001_LoggerData.csv", header=True).select("ID", "UTC").limit(200)
data_frame.createOrReplaceTempView("my_table")

# What happened?
# data_frame.printSchema()
spark.sql("desc my_table").show()

# Wow
# data_frame.first()
spark.sql("select * from my_table limit 1").show()

# data_frame.withColumnRenamed
spark.sql("select ID as SOME_ID from my_table limit 1").show()

# Casting…
data_frame.select(data_frame.ID.cast("float")).show(2)
spark.sql("select CAST(ID as FLOAT) as SOME_ID from my_table limit 1").show()

# Now, cast ID to float, then get […]

Spark and CSV for python language

It is now the second quarter of 2017. It seems that, a year on, the above-mentioned instructions are no longer adequate. Simple instructions:

# Open the file and use the first row as the header
data_frame = spark.read.csv("/path/to/file.csv", header=True)

# You receive the basic Spark type: a DataFrame.
# See what the structure looks like
data_frame.printSchema()
root
 |-- […]
