Deprecated functions in org.apache.spark.sql.functions in Spark 2.0

I just moved some of my Spark code from 1.6.0 to 2.2.0 and discovered that some functions in org.apache.spark.sql.functions._ have been replaced/renamed.

To name a few:

1) rowNumber() is replaced by row_number()

import org.apache.spark.sql.functions._
/**
* @group window_funcs
* @deprecated As of 1.6.0, replaced by `row_number`. This will be removed in Spark 2.0.
*/
@deprecated("Use row_number. This will be removed in Spark 2.0.", "1.6.0")
def rowNumber(): Column = row_number()

2) isNaN is replaced by isnan

/**
   * @group normal_funcs
   * @deprecated As of 1.6.0, replaced by `isnan`. This will be removed in Spark 2.0.
   */
  @deprecated("Use isnan. This will be removed in Spark 2.0.", "1.6.0")
  def isNaN(e: Column): Column = isnan(e)

3) inputFileName() is replaced by input_file_name()

/**
   * @group normal_funcs
   * @deprecated As of 1.6.0, replaced by `input_file_name`. This will be removed in Spark 2.0.
   */
  @deprecated("Use input_file_name. This will be removed in Spark 2.0.", "1.6.0")
  def inputFileName(): Column = input_file_name()

To get the full list of replaced/renamed functions, refer to the Spark 1.6 source:
https://github.com/apache/spark/blob/branch-1.6/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
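
For reference, here is a minimal migration sketch using the new names (the column names and data are made up for illustration; in 1.6.0 the calls noted in the comments would have used rowNumber(), isNaN() and inputFileName()):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, input_file_name, isnan, row_number}

object MigrationExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("MigrationExample").master("local[*]").getOrCreate()
    import spark.implicits._

    // Toy DataFrame; any DataFrame works the same way
    val df = Seq(("a", 1.0), ("a", Double.NaN), ("b", 2.0)).toDF("key", "value")

    val windowSpec = Window.partitionBy("key").orderBy("value")
    df.withColumn("rank", row_number().over(windowSpec))  // was rowNumber().over(windowSpec)
      .withColumn("is_nan", isnan(col("value")))          // was isNaN(col("value"))
      .withColumn("source_file", input_file_name())       // was inputFileName(); empty for in-memory data
      .show(false)

    spark.stop()
  }
}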

Scala Enumeration

In Java, we use an enum to represent a fixed set of constants.

For example, we would define a days-of-the-week enum type as follows:

public enum Day {
    SUNDAY, MONDAY, TUESDAY, WEDNESDAY,
    THURSDAY, FRIDAY, SATURDAY 
}

In Scala, we can do the same thing by extending Enumeration, for example:

object Day extends Enumeration {
  type Day = Value
  val SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY = Value
}
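
The values can then be iterated, compared, and looked up by name. A small usage sketch, assuming the Day object defined above (the helper is just for illustration):

object DayExample {
  import Day._

  // Example helper using the enumeration's type alias
  def isWeekend(day: Day): Boolean = day == SATURDAY || day == SUNDAY

  def main(args: Array[String]): Unit = {
    // Iterate over all values in declaration order
    Day.values.foreach(d => println(s"$d weekend=${isWeekend(d)}"))
    // Look a value up by its name; id is the declaration index (SUNDAY is 0)
    println(Day.withName("MONDAY").id)
  }
}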

You can find examples of Scala Enumeration usage in Spark:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/TaskState.scala

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDDCheckpointData.scala

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/InputMetrics.scala

Render JSON using Jackson in Scala

If you use the Jackson JSON library in Scala, remember to register DefaultScalaModule so that ObjectMapper can serialize Scala collections such as List and Array to JSON correctly. See below.

val objectMapper = new ObjectMapper()
objectMapper.registerModule(DefaultScalaModule)

Simple example:

import com.fasterxml.jackson.annotation.JsonAutoDetect.Visibility
import com.fasterxml.jackson.annotation.{JsonProperty, PropertyAccessor}
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule



object JsonExample {
  case class Car(@JsonProperty("id")  id: Long)
  case class Person(@JsonProperty("name") name: String = null,
                    @JsonProperty("cars") cars: Seq[Car] = null)

  def main(args: Array[String]): Unit = {
    val car1 = Car(12345)
    val car2 = Car(12346)
    val carsOwned = List(car1, car2)
    val person = Person(name="wei", cars=carsOwned)

    val objectMapper = new ObjectMapper()
    objectMapper.registerModule(DefaultScalaModule)
    objectMapper.setVisibility(PropertyAccessor.ALL, Visibility.NONE)
    objectMapper.setVisibility(PropertyAccessor.FIELD, Visibility.ANY)
    println(s"person: ${objectMapper.writeValueAsString(person)}")
  }
}

Output:
person: {"name":"wei","cars":[{"id":12345},{"id":12346}]}
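
The same mapper can also read the JSON back into the case classes. A minimal sketch, assuming the ObjectMapper configured in the example above (e.g., appended at the end of main):

    // Deserialize the JSON string back into a Person; DefaultScalaModule handles the Seq[Car] field
    val json = """{"name":"wei","cars":[{"id":12345},{"id":12346}]}"""
    val parsed = objectMapper.readValue(json, classOf[Person])
    println(s"parsed: $parsed")  // prints the reconstructed Person with its two Cars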