Once the CSV file is loaded into memory, we can split the columns of data into input and output variables.
The data will be stored in a 2D array where the first dimension is rows and the second dimension is columns, e.g. [rows, columns].
We can split the array into two arrays by selecting subsets of columns using the standard NumPy slice operator, ":". We can select the first 8 columns, from index 0 to index 7, via the slice 0:8; the end index of a slice is exclusive, so column 8 is not included. We can then select the output column (the 9th variable) via index 8.
from numpy import loadtxt

# load the dataset
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]
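To check how the slices behave without needing the CSV file, here is a minimal sketch that uses a small made-up array in place of the real dataset. The names demo, X_demo, and y_demo and the 5-row shape are assumptions for illustration only; they are not part of the tutorial's data.

```python
import numpy as np

# Hypothetical stand-in for the loaded dataset: 5 rows, 9 columns.
# Row 0 holds the values 0..8, row 1 holds 9..17, and so on.
demo = np.arange(45, dtype=float).reshape(5, 9)

# All rows, columns 0 through 7 (the end index 8 is excluded).
X_demo = demo[:, 0:8]
# All rows, column 8 only; this yields a 1D array.
y_demo = demo[:, 8]

print(X_demo.shape)  # (5, 8)
print(y_demo.shape)  # (5,)
```

Note that indexing with a single integer (8) drops the column dimension, so y_demo is one-dimensional, while the slice 0:8 keeps X_demo two-dimensional.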