StreamingQueryProgress (Spark 3.5.5 JavaDoc)
Object
- org.apache.spark.sql.streaming.StreamingQueryProgress
All Implemented Interfaces:
java.io.Serializable
public class StreamingQueryProgress
extends Object
implements scala.Serializable
Information about progress made in the execution of a StreamingQuery during a trigger. Each event relates to processing done for a single trigger of the streaming query. Events are emitted even when no new data is available to be processed.
- param id: A unique query id that persists across restarts. See StreamingQuery.id().
- param runId: A query id that is unique for every start/restart. See StreamingQuery.runId().
- param name: User-specified name of the query, null if not specified.
- param timestamp: Beginning time of the trigger in ISO8601 format, i.e. UTC timestamps.
- param batchId: A unique id for the current batch of data being processed. Note that in the case of retries after a failure a given batchId may be executed more than once. Similarly, when there is no data to be processed, the batchId will not be incremented.
- param batchDuration: The process duration of each batch.
- param durationMs: The amount of time taken to perform various operations in milliseconds.
- param eventTime: Statistics of event time seen in this batch. It may contain the following keys:
"max" -> "2016-12-05T20:54:20.827Z" // maximum event time seen in this trigger
"min" -> "2016-12-05T20:54:20.827Z" // minimum event time seen in this trigger
"avg" -> "2016-12-05T20:54:20.827Z" // average event time seen in this trigger
"watermark" -> "2016-12-05T20:54:20.827Z" // watermark used in this trigger
All timestamps are in ISO8601 format, i.e. UTC timestamps.
- param stateOperators: Information about operators in the query that store state.
- param sources: Detailed statistics on data being read from each of the streaming sources.
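Since the eventTime values are plain ISO8601 strings, they can be parsed with the standard `java.time` API. A minimal stdlib sketch (the map literal below is hypothetical, standing in for a real `eventTime()` result):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

public class EventTimeExample {
    public static void main(String[] args) {
        // Hypothetical eventTime() contents, using the ISO8601/UTC format
        // documented above; a real map comes from StreamingQueryProgress.eventTime().
        Map<String, String> eventTime = Map.of(
            "min", "2016-12-05T20:54:20.827Z",
            "max", "2016-12-05T20:54:25.827Z",
            "watermark", "2016-12-05T20:54:15.827Z");

        Instant min = Instant.parse(eventTime.get("min"));
        Instant max = Instant.parse(eventTime.get("max"));

        // Span of event time observed in this trigger.
        Duration span = Duration.between(min, max);
        System.out.println(span.getSeconds()); // prints 5
    }
}
```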
Since:
2.1.0
See Also:
Serialized Form
Method Summary

| Modifier and Type | Method and Description |
| --- | --- |
| long | batchDuration() |
| long | batchId() |
| java.util.Map<String,Long> | durationMs() |
| java.util.Map<String,String> | eventTime() |
| java.util.UUID | id() |
| double | inputRowsPerSecond() The aggregate (across all sources) rate of data arriving. |
| String | json() The compact JSON representation of this progress. |
| String | name() |
| long | numInputRows() The aggregate (across all sources) number of records processed in a trigger. |
| java.util.Map<String,Row> | observedMetrics() |
| String | prettyJson() The pretty (i.e. indented) JSON representation of this progress. |
| double | processedRowsPerSecond() The aggregate (across all sources) rate at which Spark is processing data. |
| java.util.UUID | runId() |
| SinkProgress | sink() |
| SourceProgress[] | sources() |
| StateOperatorProgress[] | stateOperators() |
| String | timestamp() |
| String | toString() |

### Methods inherited from class Object

`equals, getClass, hashCode, notify, notifyAll, wait, wait, wait`
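For orientation, an illustrative sketch of the payload shape returned by `json()` / `prettyJson()` (all values below are made up; field names mirror the accessors above, with nested entries following SourceProgress and SinkProgress):

```json
{
  "id" : "8c57e1ec-94b5-4c99-b100-f694162df0b9",
  "runId" : "ae505c5a-a64e-4896-8c28-9630cbc23e66",
  "name" : "myQuery",
  "timestamp" : "2016-12-14T18:45:24.873Z",
  "batchId" : 2,
  "numInputRows" : 10,
  "inputRowsPerSecond" : 120.0,
  "processedRowsPerSecond" : 200.0,
  "durationMs" : {
    "triggerExecution" : 3
  },
  "eventTime" : {
    "watermark" : "2016-12-14T18:45:24.873Z"
  },
  "stateOperators" : [ ],
  "sources" : [ {
    "description" : "KafkaSource[Subscribe[topic-0]]",
    "numInputRows" : 10,
    "inputRowsPerSecond" : 120.0,
    "processedRowsPerSecond" : 200.0
  } ],
  "sink" : {
    "description" : "MemorySink"
  }
}
```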
Method Detail
#### id
public java.util.UUID id()

#### runId
public java.util.UUID runId()

#### name
public String name()

#### timestamp
public String timestamp()

#### batchId
public long batchId()

#### batchDuration
public long batchDuration()

#### durationMs
public java.util.Map<String,Long> durationMs()

#### eventTime
public java.util.Map<String,String> eventTime()

#### stateOperators
public [StateOperatorProgress](../../../../../org/apache/spark/sql/streaming/StateOperatorProgress.html "class in org.apache.spark.sql.streaming")[] stateOperators()

#### sources
public [SourceProgress](../../../../../org/apache/spark/sql/streaming/SourceProgress.html "class in org.apache.spark.sql.streaming")[] sources()

#### sink
public [SinkProgress](../../../../../org/apache/spark/sql/streaming/SinkProgress.html "class in org.apache.spark.sql.streaming") sink()

#### observedMetrics
public java.util.Map<String,[Row](../../../../../org/apache/spark/sql/Row.html "interface in org.apache.spark.sql")> observedMetrics()

#### numInputRows
public long numInputRows()
The aggregate (across all sources) number of records processed in a trigger.

#### inputRowsPerSecond
public double inputRowsPerSecond()
The aggregate (across all sources) rate of data arriving.

#### processedRowsPerSecond
public double processedRowsPerSecond()
The aggregate (across all sources) rate at which Spark is processing data.

#### json
public String json()
The compact JSON representation of this progress.

#### prettyJson
public String prettyJson()
The pretty (i.e. indented) JSON representation of this progress.

#### toString
public String toString()
Overrides: `toString` in class `Object`
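As a rough sanity check on the units involved, numInputRows divided by the trigger's wall-clock duration in seconds approximates processedRowsPerSecond. A stdlib sketch (the numbers are hypothetical, and Spark computes the reported rate internally, so this is illustrative only):

```java
import java.util.Map;

public class RateExample {
    public static void main(String[] args) {
        // Hypothetical values standing in for numInputRows() and durationMs();
        // "triggerExecution" is the total time spent executing the trigger.
        long numInputRows = 1000;
        Map<String, Long> durationMs = Map.of("triggerExecution", 500L);

        double seconds = durationMs.get("triggerExecution") / 1000.0;
        double approxProcessedRowsPerSecond = numInputRows / seconds;
        System.out.println(approxProcessedRowsPerSecond); // prints 2000.0
    }
}
```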