Apache Spark
  SPARK-740

Spark block manager UI has a bug when Spark Streaming is enabled

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.7.0, 0.7.1, 0.7.2, 0.7.3
    • Fix Version/s: 0.8.0, 0.7.4
    • Component/s: Block Manager
    • Labels: None

      Description

      Currently the Spark storage UI aggregates RDDInfo by block name, and in the block manager every block name has the form rdd_<id>_<split>. In Spark Streaming, however, input block names follow a dash-separated "input-..." pattern instead. This causes an exception when grouping RDD info by block name in StorageUtils.scala:

      val groupedRddBlocks = infos.groupBy {
        case (k, v) => k.substring(0, k.lastIndexOf('_'))
      }.mapValues(_.values.toArray)

      Deriving the RDD name from the last '_' fails when Spark Streaming is used, because streaming block names contain no '_' and lastIndexOf('_') returns -1:

      java.lang.StringIndexOutOfBoundsException: String index out of range: -1
      at java.lang.String.substring(String.java:1958)
      at spark.storage.StorageUtils$$anonfun$3.apply(StorageUtils.scala:49)
      at spark.storage.StorageUtils$$anonfun$3.apply(StorageUtils.scala:48)
      at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:315)
      at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:314)
      at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:178)
      at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:347)
      at scala.collection.TraversableLike$class.groupBy(TraversableLike.scala:314)
      at scala.collection.immutable.HashMap.groupBy(HashMap.scala:38)
      at spark.storage.StorageUtils$.rddInfoFromBlockStatusList(StorageUtils.scala:48)
      at spark.storage.StorageUtils$.rddInfoFromStorageStatus(StorageUtils.scala:40)
      at spark.storage.BlockManagerUI$$anonfun$5.apply(BlockManagerUI.scala:54)
      ....
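      The failure above can be reproduced in isolation. This is a minimal sketch, not the actual StorageUtils code; the block names below are illustrative examples of the two naming patterns, not taken from a real block manager:

      ```scala
      // Sketch: RDD block names contain '_', streaming input block names do not,
      // so lastIndexOf('_') returns -1 and substring(0, -1) throws.
      object GroupingRepro {
        def rddName(blockName: String): String =
          blockName.substring(0, blockName.lastIndexOf('_'))

        def main(args: Array[String]): Unit = {
          // An RDD block name: splitting on the last '_' works as intended.
          println(rddName("rdd_2_0")) // prints "rdd_2"
          try {
            // A streaming-style input block name (illustrative): no '_' present.
            rddName("input-0-1234")
          } catch {
            case e: StringIndexOutOfBoundsException =>
              println("grouping failed: " + e) // the exception from the report
          }
        }
      }
      ```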

      There are two possible fixes:
      1. Filter out all of Spark Streaming's input blocks before grouping.
      2. Treat Spark Streaming's input blocks as a special case and add code to handle them.
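      Option 1 could look roughly like the following. This is a sketch under the assumption that streaming input block names contain no underscore; `groupRddBlocks` and the sample data are hypothetical illustrations, not the actual patch:

      ```scala
      // Sketch of option 1: drop blocks whose names carry no '_' (the streaming
      // input blocks) before grouping the rest by RDD name.
      object SafeGrouping {
        def groupRddBlocks(infos: Map[String, Long]): Map[String, Map[String, Long]] =
          infos
            .filter { case (name, _) => name.lastIndexOf('_') >= 0 } // skip "input-..." blocks
            .groupBy { case (name, _) => name.substring(0, name.lastIndexOf('_')) }

        def main(args: Array[String]): Unit = {
          val blocks = Map("rdd_2_0" -> 100L, "rdd_2_1" -> 200L, "input-0-1" -> 300L)
          // Only the rdd_* blocks are grouped; the streaming block is ignored.
          println(groupRddBlocks(blocks)) // one group keyed by "rdd_2"
        }
      }
      ```

      Option 2 would instead keep those blocks and report them separately in the UI, which is more work but preserves storage information for streaming input.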

        Activity

        Andy Petrella added a comment -

        The same goes for 0.7.3 (FYI).

        Saisai Shao added a comment - edited

        It is fixed in the master branch, but not backported to 0.7.3.

        Matei Zaharia added a comment -

        Oh, can you show me the commit where it's fixed?

        Saisai Shao added a comment -

        Hi Matei, this issue is fixed in PR 581 (https://github.com/mesos/spark/pull/581).

        Matei Zaharia added a comment -

        Thanks. I've now fixed this in branch-0.7 as well.

          People

          • Assignee: Saisai Shao
          • Reporter: Saisai Shao
          • Votes: 1
          • Watchers: 3
