Getting the response and predictors


We are going to apply the procedure outlined in the diagram below to build the response and predictors arrays.

Image.open('images/diagram_.jpg')
[Figure: diagram of the procedure used to build the response and predictors arrays]

This process is implemented by the class FeaturesExtractionHAR and the functions get_X_features_HAR and get_y_features_HAR.

  • get_X_features_HAR takes as input a list with the raw signals matrix of each individual.

  • FeaturesExtractionHAR is a class based on get_X_features_HAR that follows the scikit-learn conventions for transformers.

  • get_y_features_HAR takes as input a list with the raw activities vector of each individual.

Building the required lists:

sampling_freq = 16
n_individuals = len(HAR_database['database_training'])

X_HAR = [HAR_database['database_training'][i][0] for i in range(n_individuals)]
Y_HAR = [HAR_database['database_training'][i][1][0] for i in range(n_individuals)]
X_HAR
[array([[ 0.02417325,  0.01990478,  0.03474859, ...,  0.15302195,
          0.16644707,  0.13097667],
        [ 0.59441693,  0.60247182,  0.52582067, ...,  1.9454901 ,
          2.00124793,  1.9890223 ],
        [-0.02273627, -0.01287489, -0.0200163 , ...,  0.00468276,
         -0.00249675, -0.00409452],
        [ 0.11196179,  0.10379559,  0.10319319, ...,  0.11359097,
          0.10181863,  0.10699662],
        [ 0.06049883,  0.05515676,  0.05754055, ...,  0.04945955,
          0.06203927,  0.06837131]]),
 array([[-0.00614456,  0.09416632,  0.02556427, ...,  0.25846636,
          0.26831659,  0.20932986],
        [ 0.15163056,  0.27388097,  0.24967994, ...,  1.01129501,
          1.00222655,  0.94156865],
        [-0.02664025, -0.02277741, -0.00710769, ..., -0.02000222,
         -0.0194218 , -0.01871779],
        [-0.06663789, -0.08298231, -0.10177238, ..., -0.03517126,
         -0.04314523, -0.04061162],
        [ 0.24180082,  0.23684254,  0.24130693, ...,  0.24131465,
          0.23414938,  0.24768909]]),
 array([[ 0.01725369,  0.10293477,  0.10819053, ...,  0.71082619,
          0.6902869 ,  0.70998775],
        [ 0.68158121,  0.69141197,  0.79265585, ...,  0.84186193,
          0.8235402 ,  0.832608  ],
        [ 0.02417523,  0.04607177,  0.03142754, ...,  0.01269511,
          0.00978901,  0.00814963],
        [-0.02421756,  0.01368092, -0.01676861, ..., -0.05873524,
         -0.04761158, -0.04766088],
        [-0.26560293, -0.25658879, -0.26265961, ..., -0.11135749,
         -0.10438768, -0.10152   ]]),
 array([[ 0.05165687,  0.0650987 ,  0.01206259, ...,  0.36555553,
          0.26527593,  0.34085031],
        [ 0.47503658,  0.55140002,  0.55872758, ...,  1.42290122,
          1.47060901,  1.4077758 ],
        [ 0.13336841,  0.14036369,  0.13732303, ..., -0.07003494,
         -0.07251649, -0.08866671],
        [-0.0668408 , -0.07284105, -0.07443974, ...,  0.04751965,
          0.04598145,  0.02407247],
        [-0.16202343, -0.17998271, -0.20182392, ..., -0.21302053,
         -0.23778129, -0.25154395]]),
 array([[-0.08256422,  0.09021634,  0.20833665, ...,  0.37585218,
          0.44981253,  0.40607189],
        [ 0.53693504,  0.51929183,  0.63409263, ...,  0.83708194,
          0.8103373 ,  0.7628346 ],
        [ 0.05353819,  0.06431957,  0.11584413, ...,  0.09322134,
          0.08182516,  0.08842621],
        [ 0.02346034,  0.00344841,  0.05940777, ...,  0.02335576,
          0.00431765,  0.02563387],
        [-0.20278839, -0.22155068, -0.31653986, ..., -0.2931038 ,
         -0.22910937, -0.18418657]]),
 array([[-0.01256814,  1.075318  , -0.8095081 , ...,  0.04081069,
          0.27214215,  0.14404119],
        [ 1.29229521,  2.2547067 ,  2.32276895, ...,  0.88879623,
          0.5977003 ,  0.69151805],
        [-0.08342359, -0.23339688, -0.4362292 , ..., -0.05970851,
         -0.10780427, -0.08269408],
        [-0.02199893,  0.03932658,  0.17018843, ...,  0.01902898,
          0.01298154, -0.01913849],
        [-0.20233026, -0.12446777, -0.09608618, ...,  0.39916403,
          0.32787735,  0.12517909]]),
 array([[ 0.06971024,  0.06352591,  0.07675852, ...,  0.27126796,
          0.16124686,  0.19524657],
        [ 0.39471738,  0.40493709,  0.43956792, ...,  1.27376087,
          1.31028354,  1.25608528],
        [-0.11659078, -0.11212781, -0.10438286, ..., -0.06724237,
         -0.08350385, -0.06953112],
        [ 0.06578621,  0.0653771 ,  0.06778922, ...,  0.09091548,
          0.09260207,  0.09608102],
        [ 0.06120889,  0.05849763,  0.05579525, ...,  0.06037664,
          0.05157136,  0.03467293]]),
 array([[-0.32805745,  0.21316135,  0.07730691, ...,  0.58396364,
          0.57248017,  0.58201507],
        [ 1.10277907,  0.94095768,  0.89902692, ...,  1.84413557,
          1.78376892,  1.81727352],
        [-0.07742709, -0.07919505, -0.08941372, ..., -0.0736645 ,
         -0.07339881, -0.08894767],
        [-0.07629876, -0.10038392, -0.10460145, ..., -0.10174   ,
         -0.08993586, -0.09181229],
        [ 0.08240638,  0.06781832,  0.06951888, ...,  0.08128702,
          0.05290613,  0.06649676]])]
Y_HAR
[array([3, 3, 3, ..., 3, 3, 3], dtype=uint8),
 array([3, 3, 3, ..., 3, 3, 3], dtype=uint8),
 array([3, 3, 3, ..., 3, 3, 3], dtype=uint8),
 array([3, 3, 3, ..., 3, 3, 3], dtype=uint8),
 array([3, 3, 3, ..., 3, 3, 3], dtype=uint8),
 array([3, 3, 3, ..., 3, 3, 3], dtype=uint8),
 array([3, 3, 3, ..., 3, 3, 3], dtype=uint8),
 array([3, 3, 3, ..., 3, 3, 3], dtype=uint8)]

Using a segmentation-statistics strategy

X, X_individual, signals_block_individual = get_X_features_HAR(X_HAR=X_HAR, blocks_size=sampling_freq, statistics='mean-std', verbose=True)
Y, Y_individual, classes_block_individual = get_y_features_HAR(Y_HAR=Y_HAR, blocks_size=sampling_freq)
Time during which the activity of individual 1 of X_HAR was recorded: 18.475 mins.
Time during which the activity of individual 2 of X_HAR was recorded: 19.178 mins.
Time during which the activity of individual 3 of X_HAR was recorded: 18.167 mins.
Time during which the activity of individual 4 of X_HAR was recorded: 18.569 mins.
Time during which the activity of individual 5 of X_HAR was recorded: 17.625 mins.
Time during which the activity of individual 6 of X_HAR was recorded: 18.153 mins.
Time during which the activity of individual 7 of X_HAR was recorded: 18.808 mins.
Time during which the activity of individual 8 of X_HAR was recorded: 18.344 mins.

signals_block_individual contains the defined blocks of signals data for each individual.

signals_block_individual
  ...
  array([[-0.9535641 ,  5.54960959, -0.07485194, -0.09208762,  0.06921928],
         [-0.92491464,  5.55157934, -0.08602913, -0.09277742,  0.06688906],
         [-0.8962105 ,  5.49290386, -0.08174428, -0.0839478 ,  0.06591452],
         [-0.95288514,  5.55580542, -0.0716774 , -0.08875866,  0.06876956],
         [-0.91423   ,  5.54526606, -0.081329  , -0.08960733,  0.06445778],
         [-0.89528188,  5.47171461, -0.07622506, -0.08208996,  0.05975038],
         [-0.89141219,  5.54274991, -0.08410221, -0.08487411,  0.06084107],
         [-0.93629786,  5.51373617, -0.0842681 , -0.08512741,  0.06636033],
         [-0.96705803,  5.5260763 , -0.0833276 , -0.09228321,  0.06138676],
         [-0.9389778 ,  5.53623324, -0.07447694, -0.08578622,  0.068671  ],
         [-0.93369566,  5.55256824, -0.07629484, -0.0899456 ,  0.06754734],
         [-0.94437166,  5.55273665, -0.08122711, -0.09113733,  0.05851011],
         [-0.91352665,  5.52014708, -0.07859423, -0.09022825,  0.06744939],
         [-0.91607965,  5.5355337 , -0.08412582, -0.09034979,  0.06614856],
         [-0.9343862 ,  5.53659023, -0.07643712, -0.08472322,  0.0677591 ],
         [-0.93580094,  5.53560569, -0.07868489, -0.08562166,  0.05730293]]),
  array([[-0.90055456,  5.50864617, -0.07627973, -0.0860292 ,  0.06167482],
         [-0.92653534,  5.54668918, -0.08368662, -0.09053447,  0.05938543],
         [-0.93702774,  5.51074312, -0.06839055, -0.09325128,  0.06140467],
         [-0.959422  ,  5.56256086, -0.08130867, -0.08898674,  0.06548555],
         [-0.95321425,  5.551869  , -0.07763422, -0.09304824,  0.06171576],
         [-0.94535099,  5.57178965, -0.07406883, -0.09270522,  0.0518815 ],
         [-0.95007986,  5.53538301, -0.07618307, -0.08839877,  0.0455052 ],
         [-0.93698898,  5.55993277, -0.08161243, -0.08701398,  0.06334859],
         [-0.9238422 ,  5.54533282, -0.07210731, -0.09376708,  0.05279477],
         [-0.93602298,  5.53381804, -0.079195  , -0.09219415,  0.0500926 ],
         [-0.95499376,  5.55075285, -0.07268109, -0.08680132,  0.06033448],
         [-0.92133348,  5.55107446, -0.07462479, -0.08967334,  0.05201639],
         [-0.8999786 ,  5.56572274, -0.08031568, -0.08848783,  0.05679936],
         [-0.95171255,  5.49288661, -0.06700667, -0.08405181,  0.0629577 ],
         [-1.00657955,  5.57863802, -0.07519255, -0.0938526 ,  0.0649236 ],
         [-0.93746989,  5.56668413, -0.08118946, -0.08326364,  0.06399069]]),
  ...]}

signals_block_individual[0] contains all the blocks of signals data for the first individual.

signals_block_individual[0]
 ...
 array([[-1.41157679e+00,  5.49113572e+00,  3.44291068e-03,
          1.06197623e-01,  7.75531141e-02],
        [-1.42224570e+00,  5.48436380e+00,  1.18226670e-02,
          1.04472895e-01,  8.73067506e-02],
        [-1.40104236e+00,  5.52710150e+00,  1.88301944e-03,
          1.03488225e-01,  8.85672954e-02],
        [-1.44712496e+00,  5.53638962e+00,  1.65531128e-02,
          9.41343293e-02,  8.17513831e-02],
        [-1.40003210e+00,  5.44338058e+00,  1.16630677e-02,
          1.04349665e-01,  9.11115792e-02],
        [-1.48658998e+00,  5.63244480e+00,  1.62346028e-02,
          9.85286590e-02,  1.02221851e-01],
        [-1.47553558e+00,  5.54676784e+00,  2.81015039e-03,
          1.12765979e-01,  5.89431734e-02],
        [-1.42891074e+00,  5.54643541e+00, -2.73005986e-03,
          1.05173397e-01,  7.09235569e-02],
        [-1.40939629e+00,  5.54535642e+00, -2.37524243e-03,
          1.05340655e-01,  6.95755876e-02],
        [-1.44793733e+00,  5.54090182e+00,  1.09934707e-03,
          1.11265909e-01,  7.53285880e-02],
        [-1.44279472e+00,  5.55675357e+00, -3.78928883e-03,
          1.09030455e-01,  7.02462475e-02],
        [-1.42518648e+00,  5.57257056e+00, -2.48490679e-03,
          9.82358910e-02,  6.70371629e-02],
        [-1.47643579e+00,  5.59050342e+00, -4.36243015e-03,
          9.88542024e-02,  6.04777507e-02],
        [-1.43684525e+00,  5.54465342e+00, -9.52461946e-03,
          9.08281121e-02,  5.60087381e-02],
        [-1.41317548e+00,  5.49681737e+00, -2.59363928e-02,
          1.07734737e-01,  5.63015043e-02],
        [-1.45563435e+00,  5.55987499e+00, -1.04250344e-02,
          1.04790942e-01,  6.13796145e-02]]),
 array([[-1.39460380e+00,  5.49030499e+00, -1.69666775e-04,
          1.09775336e-01,  5.51368134e-02],
        [-1.46528076e+00,  5.53212210e+00,  3.73519515e-04,
          9.75143578e-02,  6.30618359e-02],
        [-1.41894281e+00,  5.50203588e+00, -1.34722387e-02,
          1.13882212e-01,  5.71290176e-02],
        [-1.42930337e+00,  5.53499307e+00, -4.36913113e-03,
          1.02160669e-01,  7.17726042e-02],
        [-1.43471187e+00,  5.51191566e+00, -1.53605347e-02,
          1.05782856e-01,  6.00129856e-02],
        [-1.40067929e+00,  5.50554198e+00, -1.04901340e-02,
          9.82116442e-02,  5.77522279e-02],
        [-1.39080731e+00,  5.50142206e+00, -7.96914387e-03,
          1.03072352e-01,  6.35148829e-02],
        [-1.38574907e+00,  5.50727068e+00, -9.47576741e-03,
          1.05551587e-01,  6.50442753e-02],
        [-1.42688549e+00,  5.47944194e+00, -1.08543149e-02,
          1.05105921e-01,  7.30227330e-02],
        [-1.38964184e+00,  5.46228598e+00, -6.88779954e-04,
          1.05125583e-01,  6.50120681e-02],
        [-1.37809093e+00,  5.48361479e+00, -3.90639419e-03,
          1.01320593e-01,  7.50568759e-02],
        [-1.44531111e+00,  5.52512573e+00,  5.53973443e-03,
          1.08066372e-01,  8.36645449e-02],
        [-1.38799836e+00,  5.45975483e+00,  6.63456193e-03,
          1.07703810e-01,  6.54208021e-02],
        [-1.45670058e+00,  5.51078990e+00,  1.74417191e-02,
          9.74604293e-02,  8.49360464e-02],
        [-1.40231134e+00,  5.49709688e+00,  1.85192185e-03,
          1.08070058e-01,  7.25913940e-02],
        [-1.40743121e+00,  5.55228845e+00,  4.34017189e-03,
          1.05827703e-01,  8.96557900e-02]]),
 ...]

The first individual has 1108 associated blocks because this individual's activity was recorded for 18.475 minutes. Since blocks_size=sampling_freq, each block represents one second, so the number of blocks is approximately 18.475 (mins) * 60 (secs), rounded down to whole blocks.

len(signals_block_individual[0])
1108
18.475 * 60
1108.5
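
The same arithmetic can be repeated for every individual as a rough consistency check. The sketch below is only illustrative: the durations are copied from the verbose output above, where they are rounded to three decimals, so the sample counts are reconstructions rather than the real array lengths, and the variable names are hypothetical.

```python
# Recording durations in minutes, copied from the verbose output above
# (rounded to three decimals, so the sample counts below are approximate).
durations_min = [18.475, 19.178, 18.167, 18.569, 17.625, 18.153, 18.808, 18.344]

sampling_freq = 16           # Hz, as defined earlier in this section
blocks_size = sampling_freq  # one block per second

blocks_per_individual = []
for minutes in durations_min:
    # Approximate number of samples recorded for this individual.
    n_samples = round(minutes * 60 * sampling_freq)
    # Only complete blocks are counted (an assumption that matches 1108.5 -> 1108).
    blocks_per_individual.append(n_samples // blocks_size)

total_blocks = sum(blocks_per_individual)
print(blocks_per_individual, total_blocks)
```

Under these assumptions the first two counts are 1108 and 1150, which agree with the row counts of X_individual[0] and X_individual[1], and the total is 8836, which agrees with the number of rows of X reported later in this section.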

signals_block_individual[0][0] contains the first block of signals data for the first individual.

signals_block_individual[0][0]
array([[ 2.41732537e-02,  5.94416926e-01, -2.27362738e-02,
         1.11961786e-01,  6.04988253e-02],
       [ 1.99047773e-02,  6.02471816e-01, -1.28748934e-02,
         1.03795594e-01,  5.51567615e-02],
       [ 3.47485939e-02,  5.25820668e-01, -2.00162970e-02,
         1.03193191e-01,  5.75405492e-02],
       [ 5.98320129e-02,  6.21057101e-01, -2.68414140e-02,
         1.04378202e-01,  6.38529744e-02],
       [ 4.47425524e-02,  6.53000909e-01, -1.45401225e-02,
         1.03559958e-01,  6.60838166e-02],
       [ 4.75974340e-02,  6.13626139e-01, -1.58061659e-02,
         1.11281371e-01,  8.07137578e-02],
       [ 4.27921835e-02,  5.99051778e-01, -2.92630906e-02,
         1.04612408e-01,  7.14740148e-02],
       [ 8.60722067e-02,  6.72992949e-01, -7.92528632e-02,
         9.96278016e-02,  4.63797911e-02],
       [-2.51357734e-02,  7.15842468e-01, -1.05143449e-01,
         1.01205659e-01,  9.30532899e-02],
       [ 1.81585704e-02,  8.69505899e-01, -9.50210008e-02,
         1.11931738e-01,  1.75407160e-01],
       [ 3.70359989e-02,  8.22017395e-01, -3.51349151e-02,
         1.22890757e-01,  2.00134131e-01],
       [ 6.64756007e-03,  6.46300229e-01, -3.29941144e-02,
         1.13191883e-01,  2.20028545e-01],
       [ 6.60467555e-02,  7.73661845e-01, -3.18034115e-02,
         1.07448174e-01,  1.81809689e-01],
       [ 1.08766716e-02,  7.98018119e-01,  2.86174099e-04,
         9.92467900e-02,  7.77358566e-02],
       [ 4.62967968e-02,  7.19180645e-01,  8.28934867e-03,
         9.87878476e-02, -2.87454467e-02],
       [ 1.86521130e-02,  6.54146663e-01, -1.36814545e-02,
         9.56474265e-02, -6.44420394e-02]])
signals_block_individual[0][0].shape
(16, 5)
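
Note that the first row of this block matches the first column of X_HAR[0] printed earlier, which suggests that each raw matrix stores one signal per row and is transposed before being cut into blocks. A minimal sketch of such a segmentation, assuming that layout and assuming that a trailing incomplete block is discarded, could look as follows. The function split_into_blocks is hypothetical and is not the implementation used by get_X_features_HAR.

```python
import numpy as np

def split_into_blocks(signals, blocks_size):
    """Cut a (n_signals, n_samples) matrix into consecutive blocks.

    Hypothetical helper: each returned block has shape
    (blocks_size, n_signals); leftover samples at the end are dropped.
    """
    samples = np.asarray(signals).T              # rows become time steps
    n_blocks = samples.shape[0] // blocks_size   # complete blocks only
    return [samples[k * blocks_size:(k + 1) * blocks_size]
            for k in range(n_blocks)]

# Toy data: 5 signals, 40 time steps, blocks of 16 samples.
toy_signals = np.arange(5 * 40, dtype=float).reshape(5, 40)
toy_blocks = split_into_blocks(toy_signals, blocks_size=16)
print(len(toy_blocks), toy_blocks[0].shape)
```

With 40 time steps and blocks of 16 samples, two complete blocks are produced and the last 8 samples are left out.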

signals_block_individual[0][2] contains the third block of signals data for the first individual.

signals_block_individual[0][2]
array([[ 0.05603331,  0.53029042,  0.01934483,  0.11045798,  0.04319037],
       [ 0.04963319,  0.51433076,  0.00830888,  0.11048016,  0.02880013],
       [ 0.02862443,  0.51042654,  0.00441286,  0.11152631,  0.02093459],
       [ 0.00882263,  0.45604613, -0.0059629 ,  0.10417648,  0.02131418],
       [ 0.01667981,  0.4761595 , -0.01400896,  0.10553487,  0.045963  ],
       [ 0.02642862,  0.4581321 , -0.01897237,  0.11587779,  0.06677644],
       [ 0.05100351,  0.45989142, -0.02613971,  0.10646909,  0.08025166],
       [ 0.05723173,  0.47876043, -0.03667004,  0.10802934,  0.09362343],
       [ 0.038007  ,  0.55108577, -0.03700678,  0.10702783,  0.11128936],
       [ 0.04394006,  0.51514546, -0.02416446,  0.11626003,  0.09354207],
       [ 0.04164859,  0.56611433, -0.02531067,  0.10683497,  0.06726321],
       [ 0.03825797,  0.53338986, -0.02089029,  0.10345215,  0.06435183],
       [ 0.04174467,  0.53574655, -0.0228516 ,  0.10338284,  0.03382761],
       [ 0.03180509,  0.55494041, -0.02121425,  0.10908425,  0.04784185],
       [ 0.03366363,  0.62921005, -0.01525349,  0.10930857,  0.04601256],
       [ 0.02781816,  0.55022712, -0.01456356,  0.10613814,  0.02238647]])
signals_block_individual[0][2].shape
(16, 5)

In the feature extraction part, statistics are computed over the columns of these blocks leading to a vector of statistics for each block.

For example, X_individual[0][2] contains that vector of statistics (mean-std in this case) computed over signals_block_individual[0][2]: the five column means followed by the five column standard deviations.

X_individual[0][2]
array([ 0.0369589 ,  0.51999355, -0.01568391,  0.10837755,  0.05546055,
        0.01307035,  0.04540661,  0.01498785,  0.0037611 ,  0.02738979])
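
To make the computation concrete, the sketch below applies a column-wise mean and standard deviation to the first two signals of signals_block_individual[0][2], copied from the output above. It assumes NumPy's default population standard deviation (ddof=0); the helper mean_std_features is hypothetical and only mimics what statistics='mean-std' appears to do.

```python
import numpy as np

def mean_std_features(block):
    """Return the column means followed by the column standard deviations."""
    block = np.asarray(block, dtype=float)
    return np.concatenate([block.mean(axis=0), block.std(axis=0)])

# First two columns of signals_block_individual[0][2] (values as printed above).
block = np.array([
    [0.05603331, 0.53029042],
    [0.04963319, 0.51433076],
    [0.02862443, 0.51042654],
    [0.00882263, 0.45604613],
    [0.01667981, 0.4761595],
    [0.02642862, 0.4581321],
    [0.05100351, 0.45989142],
    [0.05723173, 0.47876043],
    [0.038007, 0.55108577],
    [0.04394006, 0.51514546],
    [0.04164859, 0.56611433],
    [0.03825797, 0.53338986],
    [0.04174467, 0.53574655],
    [0.03180509, 0.55494041],
    [0.03366363, 0.62921005],
    [0.02781816, 0.55022712],
])

features = mean_std_features(block)
print(features)  # two means, then two standard deviations
```

The two means should agree with the first two entries of X_individual[0][2] and the two standard deviations with its sixth and seventh entries, up to the precision of the printed values.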

X_individual contains the statistics of each signal within each block of signals data, for each individual. In this case the statistics used are the mean and the standard deviation.

X_individual
{0: array([[ 0.03365261,  0.68006947, -0.03290837, ...,  0.03137443,
          0.00674869,  0.07465121],
        [ 0.02045495,  0.73399091,  0.01038365, ...,  0.02361024,
          0.00949732,  0.05903554],
        [ 0.0369589 ,  0.51999355, -0.01568391, ...,  0.01498785,
          0.0037611 ,  0.02738979],
        ...,
        [ 0.13095113,  2.04631957, -0.00542365, ...,  0.00836883,
          0.00722404,  0.02061125],
        [ 0.12373219,  2.10951935, -0.00282314, ...,  0.01206207,
          0.00446198,  0.01002007],
        [ 0.12730595,  2.04735858,  0.0047647 , ...,  0.00939929,
          0.00602465,  0.01844863]]),
 1: array([[ 0.02552456,  0.21004328, -0.01343678, ...,  0.00722181,
          0.01536981,  0.01618562],
        [ 0.02349105,  0.26255908, -0.01435662, ...,  0.00677772,
          0.0155398 ,  0.01864301],
        [ 0.0157643 ,  0.27034967, -0.02089435, ...,  0.00927781,
          0.01498527,  0.02505213],
        ...,
        [ 0.2404649 ,  0.9297338 , -0.02188205, ...,  0.00374384,
          0.00814611,  0.01080869],
        [ 0.24607284,  0.93328066, -0.02392841, ...,  0.00638742,
          0.00831582,  0.02118794],
        [ 0.24291153,  0.93360847, -0.02426387, ...,  0.00710508,
          0.00533481,  0.03115763]]),
 2: array([[ 0.10801032,  0.76176865,  0.02474844, ...,  0.0069424 ,
          0.03709814,  0.05724372],
        [ 0.05981253,  0.83088278,  0.02723256, ...,  0.03450767,
          0.02810232,  0.08711011],
        [ 0.11679964,  0.74350529,  0.05464009, ...,  0.03387063,
          0.02983833,  0.20613631],
        ...,
        [ 0.71992872,  0.85580899,  0.03798465, ...,  0.02304135,
          0.00976759,  0.04732222],
        [ 0.73835786,  0.75748506,  0.01521413, ...,  0.00685599,
          0.00782889,  0.02519091],
        [-3.87172562,  5.55654944,  0.01570616, ...,  0.32971314,
          0.67303052,  0.46222179]]),
 3: array([[ 0.04443008,  0.43412379,  0.16419157, ...,  0.01807173,
          0.00864611,  0.02158461],
        [ 0.02419495,  0.31801015,  0.16868582, ...,  0.01143476,
          0.01005742,  0.03533135],
        [ 0.03603484,  0.33835152,  0.15728174, ...,  0.0059737 ,
          0.00227381,  0.00770898],
        ...,
        [ 0.31448466,  1.39393902, -0.07519126, ...,  0.00959815,
          0.00468301,  0.01797219],
        [ 0.30890264,  1.41510699, -0.07569824, ...,  0.00456138,
          0.00476616,  0.02046283],
        [ 0.31469576,  1.42359757, -0.08132667, ...,  0.00797064,
          0.00830861,  0.01498944]]),
 4: array([[0.03371394, 0.47198146, 0.0817463 , ..., 0.02570032, 0.02738872,
         0.16433269],
        [0.02387409, 0.23705136, 0.07627262, ..., 0.03653719, 0.04914388,
         0.07821166],
        [0.05551869, 0.33990935, 0.05623541, ..., 0.05567476, 0.08830851,
         0.11550644],
        ...,
        [0.40945415, 1.08191715, 0.05734391, ..., 0.01960219, 0.01932731,
         0.1474837 ],
        [0.4055499 , 1.04583547, 0.0631354 , ..., 0.03362127, 0.03286983,
         0.53459522],
        [0.39714895, 1.12169743, 0.08458947, ..., 0.03150147, 0.02626184,
         0.557407  ]]),
 5: array([[-0.13344846,  2.07301226, -0.13882371, ...,  0.10257799,
          0.07304478,  0.06845573],
        [-0.15977253,  2.14019803, -0.12017533, ...,  0.03349518,
          0.02052632,  0.05615521],
        [-0.1618949 ,  2.1809649 , -0.15075157, ...,  0.03201487,
          0.02782587,  0.06269532],
        ...,
        [ 0.13476305,  0.76418404, -0.15004717, ...,  0.01909474,
          0.00966084,  0.03784605],
        [ 0.14829756,  0.6773758 , -0.13221036, ...,  0.01038249,
          0.008176  ,  0.01989335],
        [ 0.15518139,  0.6740382 , -0.12196696, ...,  0.02631454,
          0.02744556,  0.20666895]]),
 6: array([[ 0.06998025,  0.38400484, -0.1049222 , ...,  0.00659321,
          0.00613909,  0.01220299],
        [ 0.0665933 ,  0.36126145, -0.10261548, ...,  0.00473752,
          0.00752425,  0.01153158],
        [ 0.06857345,  0.3784264 , -0.10048799, ...,  0.00606604,
          0.00565864,  0.02474517],
        ...,
        [ 0.22535568,  0.91850084, -0.05600098, ...,  0.01095303,
          0.01484306,  0.02475246],
        [ 0.2325681 ,  0.94931234, -0.07971868, ...,  0.01772623,
          0.03476652,  0.03522503],
        [ 0.20799562,  1.21781196, -0.07500503, ...,  0.00845219,
          0.01024371,  0.02757259]]),
 7: array([[ 0.056215  ,  0.95812313, -0.07279302, ...,  0.01828723,
          0.02289719,  0.0222955 ],
        [ 0.06243653,  0.82328174, -0.07302929, ...,  0.02840644,
          0.03581865,  0.111916  ],
        [ 0.04178255,  0.75221462, -0.07660493, ...,  0.01675104,
          0.02213708,  0.03087231],
        ...,
        [ 0.57247732,  1.83496397, -0.08331397, ...,  0.00445178,
          0.00591461,  0.00764645],
        [ 0.56999553,  1.84273059, -0.07963554, ...,  0.00479656,
          0.00320455,  0.00933258],
        [ 0.5736782 ,  1.84422192, -0.08041299, ...,  0.00561236,
          0.00471552,  0.00715709]])}

X_individual[i] contains the signals stats for each block into which the signals data of the individual at index i was segmented. Each block represents 1 second in this case, since blocks_size=sampling_freq.

X_individual[0]
array([[ 0.03365261,  0.68006947, -0.03290837, ...,  0.03137443,
         0.00674869,  0.07465121],
       [ 0.02045495,  0.73399091,  0.01038365, ...,  0.02361024,
         0.00949732,  0.05903554],
       [ 0.0369589 ,  0.51999355, -0.01568391, ...,  0.01498785,
         0.0037611 ,  0.02738979],
       ...,
       [ 0.13095113,  2.04631957, -0.00542365, ...,  0.00836883,
         0.00722404,  0.02061125],
       [ 0.12373219,  2.10951935, -0.00282314, ...,  0.01206207,
         0.00446198,  0.01002007],
       [ 0.12730595,  2.04735858,  0.0047647 , ...,  0.00939929,
         0.00602465,  0.01844863]])
X_individual[0].shape
(1108, 10)
X_individual[1]
array([[ 0.02552456,  0.21004328, -0.01343678, ...,  0.00722181,
         0.01536981,  0.01618562],
       [ 0.02349105,  0.26255908, -0.01435662, ...,  0.00677772,
         0.0155398 ,  0.01864301],
       [ 0.0157643 ,  0.27034967, -0.02089435, ...,  0.00927781,
         0.01498527,  0.02505213],
       ...,
       [ 0.2404649 ,  0.9297338 , -0.02188205, ...,  0.00374384,
         0.00814611,  0.01080869],
       [ 0.24607284,  0.93328066, -0.02392841, ...,  0.00638742,
         0.00831582,  0.02118794],
       [ 0.24291153,  0.93360847, -0.02426387, ...,  0.00710508,
         0.00533481,  0.03115763]])
X_individual[1].shape
(1150, 10)

X_individual[0] and X_individual[1] have a different number of rows because the recording times of the first and second individuals are not the same.

Signals were recorded for 18.48 minutes for individual 1 and for 19.18 minutes for individual 2. This causes the difference in the number of blocks, and therefore in the number of rows.

X_individual[i][j,:] contains the signals stats for one second of the data recorded for one individual, where i is the index of the individual and j is the index of the second (both counted from zero).

For example, the first individual was recorded for 18.48 minutes, and since the sampling frequency was 16 Hz we have 16 observations of the signals for each second of the recording. We then segmented that data into blocks of a size equal to the sampling frequency, that is, blocks of 16 observations (rows). Statistics such as the mean and standard deviation are computed within each block (by columns), so each block is transformed into a signals stats vector that represents the signals stats of the individual in a given second of the recording. These signals stats vectors are the data observations.

A similar procedure was applied to the activities data, but using the mode to transform the blocks into single values. Therefore, the most frequent activity within a block represents that block, and is taken as the response observation for the second associated with the block.
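
The mode-based reduction of the activities can be sketched as follows. This is an illustration under stated assumptions, not the code of get_y_features_HAR: the helper block_modes is hypothetical, incomplete trailing blocks are dropped, and ties are resolved in favour of the smallest label, which may differ from the real function.

```python
import numpy as np

def block_modes(labels, blocks_size):
    """Return the most frequent label of each complete block."""
    labels = np.asarray(labels)
    n_blocks = labels.size // blocks_size
    # Keep only the samples that fill complete blocks, one block per row.
    blocks = labels[:n_blocks * blocks_size].reshape(n_blocks, blocks_size)
    modes = []
    for block in blocks:
        values, counts = np.unique(block, return_counts=True)
        modes.append(values[np.argmax(counts)])  # first maximum wins on ties
    return np.array(modes, dtype=labels.dtype)

# Toy labels: 36 samples, so two complete blocks of 16 and 4 leftover samples.
toy_labels = np.array([3] * 10 + [1] * 6 + [2] * 9 + [3] * 7 + [5] * 4,
                      dtype=np.uint8)
toy_modes = block_modes(toy_labels, blocks_size=16)
print(toy_modes)
```

In the toy example the first block holds ten 3s and six 1s, and the second holds nine 2s and seven 3s, so the two modes are 3 and 2.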

Examples:

  • X_individual[2][15,:] is the signals stats vector of individual 3 in second 16 of the recording.

  • Y_individual[2][15] is the most frequent activity done by individual 3 in second 16 of the recording.

X_individual[2][15,:]
array([ 0.09538123,  0.83600318,  0.02143199, -0.0570608 , -0.09876293,
        0.01846191,  0.03842564,  0.00916219,  0.01059745,  0.0141913 ])
Y_individual[2][15]
3

X is the result of concatenating the arrays of X_individual by rows.

Therefore, X is the predictors matrix, where each row contains the signals statistics of a certain individual in a specific second. These rows represent the observations, and the columns, each of which holds one signal statistic across every recorded second of every available individual, represent the predictor variables.

X
array([[ 0.03365261,  0.68006947, -0.03290837, ...,  0.03137443,
         0.00674869,  0.07465121],
       [ 0.02045495,  0.73399091,  0.01038365, ...,  0.02361024,
         0.00949732,  0.05903554],
       [ 0.0369589 ,  0.51999355, -0.01568391, ...,  0.01498785,
         0.0037611 ,  0.02738979],
       ...,
       [ 0.57247732,  1.83496397, -0.08331397, ...,  0.00445178,
         0.00591461,  0.00764645],
       [ 0.56999553,  1.84273059, -0.07963554, ...,  0.00479656,
         0.00320455,  0.00933258],
       [ 0.5736782 ,  1.84422192, -0.08041299, ...,  0.00561236,
         0.00471552,  0.00715709]])
X.shape
(8836, 10)
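
The row-wise concatenation can be illustrated with a toy dictionary shaped like X_individual. The names and sizes below are made up for the example; only np.concatenate along axis 0 is the point.

```python
import numpy as np

# Toy stand-in for X_individual: one (n_blocks, n_features) array per individual.
toy_X_individual = {
    0: np.zeros((3, 10)),  # individual with 3 blocks
    1: np.ones((2, 10)),   # individual with 2 blocks
}

# Stack the per-individual arrays on top of each other, in index order.
toy_X = np.concatenate(
    [toy_X_individual[i] for i in sorted(toy_X_individual)], axis=0
)
print(toy_X.shape)
```

The number of rows of the result is the sum of the number of blocks of each individual, while the number of columns stays the same.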

Another way to obtain the predictors matrix X is to use the class FeaturesExtractionHAR, which follows the scikit-learn conventions for transformers.

features_extraction_HAR = FeaturesExtractionHAR(blocks_size=sampling_freq, statistics='mean-std')
X = features_extraction_HAR.fit_transform(X=X_HAR)
X
array([[ 0.03365261,  0.68006947, -0.03290837, ...,  0.03137443,
         0.00674869,  0.07465121],
       [ 0.02045495,  0.73399091,  0.01038365, ...,  0.02361024,
         0.00949732,  0.05903554],
       [ 0.0369589 ,  0.51999355, -0.01568391, ...,  0.01498785,
         0.0037611 ,  0.02738979],
       ...,
       [ 0.57247732,  1.83496397, -0.08331397, ...,  0.00445178,
         0.00591461,  0.00764645],
       [ 0.56999553,  1.84273059, -0.07963554, ...,  0.00479656,
         0.00320455,  0.00933258],
       [ 0.5736782 ,  1.84422192, -0.08041299, ...,  0.00561236,
         0.00471552,  0.00715709]])
X.shape
(8836, 10)
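
For readers unfamiliar with the transformer conventions, the sketch below shows the general shape such a class might take: fit returns self, transform returns the feature matrix, and fit_transform chains both. It is a simplified, hypothetical stand-in, not the source of FeaturesExtractionHAR. It is written without inheritance to stay self-contained; in practice one would usually inherit from scikit-learn's BaseEstimator and TransformerMixin, the latter providing fit_transform automatically.

```python
import numpy as np

class MeanStdBlockTransformer:
    """Hypothetical transformer: mean-std features per block of samples."""

    def __init__(self, blocks_size):
        self.blocks_size = blocks_size

    def fit(self, X, y=None):
        # Nothing is learned from the data, so fitting is a no-op.
        return self

    def transform(self, X):
        rows = []
        for signals in X:  # one (n_signals, n_samples) matrix per individual
            samples = np.asarray(signals, dtype=float).T
            n_blocks = samples.shape[0] // self.blocks_size
            for k in range(n_blocks):
                block = samples[k * self.blocks_size:(k + 1) * self.blocks_size]
                rows.append(np.concatenate([block.mean(axis=0),
                                            block.std(axis=0)]))
        return np.vstack(rows)

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)

# Toy input: two individuals with 5 signals and 40 and 35 samples.
rng = np.random.default_rng(0)
toy_X_HAR = [rng.normal(size=(5, 40)), rng.normal(size=(5, 35))]
toy_features = MeanStdBlockTransformer(blocks_size=16).fit_transform(toy_X_HAR)
print(toy_features.shape)
```

Each toy individual contributes two complete blocks, so the output has four rows and ten columns (five means and five standard deviations).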

Y is the result of concatenating the arrays of Y_individual to form a single vector.

Therefore, Y is the response vector, where each component indicates the activity that a certain individual was doing in a specific second.

Y
array([3, 3, 3, ..., 3, 3, 3], dtype=uint8)
Y.shape
(8836,)