Apache Spark / SPARK-740

Spark block manager UI throws an exception when Spark Streaming is enabled

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.7.0, 0.7.1, 0.7.2, 0.7.3
    • Fix Version/s: 0.8.0, 0.7.4
    • Component/s: Block Manager
    • Labels: None

      Description

      Currently the Spark storage UI aggregates RDDInfo by block name, and in the block manager every block name has the form rdd__. In Spark Streaming, however, input block names have the form input--. Such a name contains no '_', so lastIndexOf('_') returns -1 and substring throws an exception when RDD info is grouped by block name in StorageUtils.scala:

      val groupedRddBlocks = infos.groupBy { case (k, v) =>
        k.substring(0, k.lastIndexOf('_'))
      }.mapValues(_.values.toArray)

      Using '_' to extract the RDD name fails with the following exception when Spark Streaming is in use:

      java.lang.StringIndexOutOfBoundsException: String index out of range: -1
      at java.lang.String.substring(String.java:1958)
      at spark.storage.StorageUtils$$anonfun$3.apply(StorageUtils.scala:49)
      at spark.storage.StorageUtils$$anonfun$3.apply(StorageUtils.scala:48)
      at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:315)
      at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:314)
      at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:178)
      at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:347)
      at scala.collection.TraversableLike$class.groupBy(TraversableLike.scala:314)
      at scala.collection.immutable.HashMap.groupBy(HashMap.scala:38)
      at spark.storage.StorageUtils$.rddInfoFromBlockStatusList(StorageUtils.scala:48)
      at spark.storage.StorageUtils$.rddInfoFromStorageStatus(StorageUtils.scala:40)
      at spark.storage.BlockManagerUI$$anonfun$5.apply(BlockManagerUI.scala:54)
      ....
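
The failure can be reproduced outside Spark with plain string operations. The sketch below is not taken from StorageUtils.scala: `rddKey` is a name introduced here, and the numeric ids in the two block names are invented for illustration, since the description only gives the `rdd_` and `input-` prefixes.

```scala
import scala.util.Try

object BlockNameRepro {
  // Same expression as the groupBy key shown in the description.
  def rddKey(blockName: String): String =
    blockName.substring(0, blockName.lastIndexOf('_'))

  def main(args: Array[String]): Unit = {
    // Hypothetical block names; the ids are made up for illustration.
    val rddBlock = "rdd_0_1"
    val inputBlock = "input-0-1"

    println(rddKey(rddBlock)) // rdd_0

    // The name has no '_': lastIndexOf returns -1 and substring(0, -1) throws.
    val result = Try(rddKey(inputBlock))
    println(result.isFailure) // true
  }
}
```

Any block name without an underscore triggers the same StringIndexOutOfBoundsException, so the problem is not specific to the storage UI code path.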

      There are two possible fixes:
      1. Filter out all of Spark Streaming's input block RDDs.
      2. Treat Spark Streaming's input RDDs as a special case and add code to support them.
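
Both options can be sketched in a few lines of Scala. This is an illustration under stated assumptions, not the change that was merged: `BlockInfo`, `groupRddOnly`, `groupKey`, and `groupAll` are hypothetical names, the block names in `main` use invented ids, and grouping input blocks by the text before their last '-' is only one plausible reading of option 2.

```scala
object GroupBlocksSketch {
  // Stand-in for the per-block status value; the real Spark type differs.
  final case class BlockInfo(memSize: Long)

  // Option 1: drop every block whose name does not start with "rdd_".
  def groupRddOnly(infos: Map[String, BlockInfo]): Map[String, Seq[BlockInfo]] =
    infos.toSeq
      .filter { case (name, _) => name.startsWith("rdd_") }
      .groupBy { case (name, _) => name.substring(0, name.lastIndexOf('_')) }
      .map { case (key, entries) => key -> entries.map(_._2) }

  // Option 2: pick the separator by prefix (assumption), and fall back to
  // the full name when the separator is absent so substring never sees -1.
  def groupKey(name: String): String = {
    val sep = if (name.startsWith("input-")) '-' else '_'
    val idx = name.lastIndexOf(sep)
    if (idx >= 0) name.substring(0, idx) else name
  }

  def groupAll(infos: Map[String, BlockInfo]): Map[String, Seq[BlockInfo]] =
    infos.toSeq
      .groupBy { case (name, _) => groupKey(name) }
      .map { case (key, entries) => key -> entries.map(_._2) }

  def main(args: Array[String]): Unit = {
    // Hypothetical block names; the ids are made up for illustration.
    val infos = Map(
      "rdd_0_0" -> BlockInfo(10L),
      "rdd_0_1" -> BlockInfo(20L),
      "input-0-1" -> BlockInfo(30L)
    )
    println(groupRddOnly(infos).keySet) // Set(rdd_0)
    println(groupAll(infos).keySet)
  }
}
```

Option 1 is the smaller change but hides streaming input blocks from the storage UI, while option 2 keeps them visible at the cost of a second naming rule.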

            Activity

            noootsab Andy Petrella added a comment -

            The same happens with 0.7.3 (FYI).

            saisai_shao Saisai Shao added a comment (edited) -

            It is fixed in the master branch, but the fix has not been backported to 0.7.3.

            matei Matei Zaharia added a comment -

            Oh, can you show me the commit where it's fixed?

            saisai_shao Saisai Shao added a comment -

            Hi Matei, this issue is fixed in PR 581 (https://github.com/mesos/spark/pull/581).

            matei Matei Zaharia added a comment -

            Thanks. I've now fixed this in branch-0.7 as well.


              People

              • Assignee: saisai_shao Saisai Shao
              • Reporter: saisai_shao Saisai Shao
              • Votes: 1
              • Watchers: 3
