Data

Four types of data are considered in the OGCID project:

  • The detector geometry. It describes all the available sensors in the detectors. Every sensor has a unique id (cellID) and three coordinates giving the position of its barycenter.
  • The events. Every event is a set of measurements, each associated with a sensor, together with the variables to regress (labelled data). Every event has a unique id (eventID) and label data (categories or numerical values to regress). An event is composed of measurements, each consisting of a cellID and various measured quantities (energy deposit, timing…). Thus, an event is essentially a point cloud with features (measurements) attached to each point. Each point is called a hit.
  • The graph events. To build graphs from the events, every hit is treated as a node of the graph. The edges, which have no physical meaning, have to be added and given features by an algorithm. The graph event is thus an attributed graph with as many nodes as there are hits: each node carries the measurements of its hit, and the added edges can carry geometrical information (coordinates, distance, angle…).
  • The proximity table. Creating the graph requires adding edges between the nodes, which are the hits. The K-nearest-neighbors algorithm is often used, but its complexity is quadratic in the number of hits (both on average and in the worst case). The idea behind this project is to optimize graph construction by exploiting the fact that the sensors are fixed in space. A table can therefore be built that contains, for each sensor, all its potential neighbors, sorted by their probability of being the next neighbor. This reduces the average complexity of graph generation. The proximity table thus contains lists of sorted neighbors.
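
The four data types above can be illustrated with a minimal Python sketch. It is not the OGCID implementation: the function names (build_proximity_table, build_graph_event), the dictionary layouts, and the parameter k are hypothetical, and plain Euclidean distance between barycenters is used as a stand-in for the "probability to be the next neighbor" ranking, which the text does not specify.

```python
import math

def build_proximity_table(geometry, max_candidates=None):
    """Precompute, for every sensor, its candidate neighbors.

    geometry: dict mapping cellID -> (x, y, z) barycenter.
    Assumption: candidates are ranked by increasing Euclidean distance,
    used here in place of the unspecified neighbor probability.
    """
    table = {}
    for cell_id, pos in geometry.items():
        others = [other for other in geometry if other != cell_id]
        others.sort(key=lambda other: math.dist(pos, geometry[other]))
        table[cell_id] = others[:max_candidates]  # None keeps the full list
    return table

def build_graph_event(hits, geometry, table, k=2):
    """Connect each hit to its first k candidates that are also hits.

    hits: dict mapping cellID -> measurements (the node features).
    Returns directed edges (source cellID, target cellID, distance).
    """
    active = set(hits)
    edges = []
    for cell_id in hits:
        found = 0
        for candidate in table[cell_id]:
            if candidate in active:
                dist = math.dist(geometry[cell_id], geometry[candidate])
                edges.append((cell_id, candidate, dist))
                found += 1
                if found == k:
                    break  # stop scanning once k neighbors are found
    return edges

# Toy detector geometry: four sensors on a line (hypothetical values).
geometry = {0: (0.0, 0.0, 0.0), 1: (1.0, 0.0, 0.0),
            2: (2.0, 0.0, 0.0), 3: (5.0, 0.0, 0.0)}
# Toy event: three hits with one measurement each (hypothetical values).
hits = {0: {"energy": 1.2}, 2: {"energy": 0.4}, 3: {"energy": 2.1}}

table = build_proximity_table(geometry)
edges = build_graph_event(hits, geometry, table, k=1)
```

The table depends only on the detector geometry, so in this sketch it is computed once and reused for every event. Building a graph event then amounts to scanning each hit's sorted candidate list and stopping early, rather than comparing every pair of hits.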

Two datasets are used for the OGCID project.