Interactive Table Visualisation
In this notebook, you’ll learn how to use UrbanMapper to load a tabular dataset (such as a CSV file) and visualise it interactively using the TableVisMixin class. This is a great way to explore your urban data dynamically, thanks to Skrub’s table report: https://skrub-data.org/stable/.
Let’s dive in! 🚀
Step 1: Initialising UrbanMapper
We’ll start by creating an instance of UrbanMapper. This sets up the environment for loading your data.
import urban_mapper as um
# Initialise UrbanMapper
mapper = um.UrbanMapper()
Step 2: Loading Your Data
Now, we’ll load the data using UrbanMapper’s loader. This example pulls the first 1,000 rows of the oscur/taxisvis1M dataset from Hugging Face; to use your own CSV file instead, swap in UrbanMapper’s file loader and point it at your file path. We also name the latitude and longitude columns so that the loader can build a geometry column for geospatial use. Change the column names to match your data.
Note that the geometry column in the output below reads POINT (latitude longitude). If your workflow expects x to be the longitude, check the argument order that with_columns expects in the UrbanMapper loader documentation.
# Load the first 1,000 rows of the oscur/taxisvis1M dataset from Hugging Face
csv_loader = (
    mapper.loader
    .from_huggingface("oscur/taxisvis1M", number_of_rows=1000, streaming=True)
    .with_columns("pickup_latitude", "pickup_longitude")
)
data = csv_loader.load()
data.head()  # Preview the first few rows
| | VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | pickup_longitude | pickup_latitude | RateCodeID | store_and_fwd_flag | dropoff_longitude | dropoff_latitude | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | geometry |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2 | 2015-01-15 19:05:39 | 2015-01-15 19:23:42 | 1 | 1.59 | -73.993896 | 40.750111 | 1 | N | -73.974785 | 40.750618 | 1 | 12.0 | 1.0 | 0.5 | 3.25 | 0.0 | 0.3 | 17.05 | POINT (40.75011 -73.9939) |
1 | 1 | 2015-01-10 20:33:38 | 2015-01-10 20:53:28 | 1 | 3.30 | -74.001648 | 40.724243 | 1 | N | -73.994415 | 40.759109 | 1 | 14.5 | 0.5 | 0.5 | 2.00 | 0.0 | 0.3 | 17.80 | POINT (40.72424 -74.00165) |
2 | 1 | 2015-01-10 20:33:38 | 2015-01-10 20:43:41 | 1 | 1.80 | -73.963341 | 40.802788 | 1 | N | -73.951820 | 40.824413 | 2 | 9.5 | 0.5 | 0.5 | 0.00 | 0.0 | 0.3 | 10.80 | POINT (40.80279 -73.96334) |
3 | 1 | 2015-01-10 20:33:39 | 2015-01-10 20:35:31 | 1 | 0.50 | -74.009087 | 40.713818 | 1 | N | -74.004326 | 40.719986 | 2 | 3.5 | 0.5 | 0.5 | 0.00 | 0.0 | 0.3 | 4.80 | POINT (40.71382 -74.00909) |
4 | 1 | 2015-01-10 20:33:39 | 2015-01-10 20:52:58 | 1 | 3.00 | -73.971176 | 40.762428 | 1 | N | -74.004181 | 40.742653 | 2 | 15.0 | 0.5 | 0.5 | 0.00 | 0.0 | 0.3 | 16.30 | POINT (40.76243 -73.97118) |
Step 3: Displaying the Table Interactively
With your data loaded, let’s use TableVisMixin to create an interactive table. This will allow you to sort, filter, and explore the data dynamically. The report previews 10 rows of the table (n_rows=10).
Click on a column to inspect it in Skrub’s interactive report.
# Build the interactive table report for the loaded data
vis = mapper.table_vis.interactive_display(
    dataframe=data,
    n_rows=10,
    title="Interactive Urban Data Report",
    verbose=1,
)
vis
Processing column 1 / 20
Processing column 2 / 20
Processing column 3 / 20
Processing column 4 / 20
Processing column 5 / 20
Processing column 6 / 20
Processing column 7 / 20
Processing column 8 / 20
Processing column 9 / 20
Processing column 10 / 20
Processing column 11 / 20
Processing column 12 / 20
Processing column 13 / 20
Processing column 14 / 20
Processing column 15 / 20
Processing column 16 / 20
Processing column 17 / 20
Processing column 18 / 20
Processing column 19 / 20
Processing column 20 / 20
Interactive Urban Data Report
| | VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | pickup_longitude | pickup_latitude | RateCodeID | store_and_fwd_flag | dropoff_longitude | dropoff_latitude | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | geometry |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2 | 2015-01-15 19:05:39 | 2015-01-15 19:23:42 | 1 | 1.59 | -73.993896484375 | 40.7501106262207 | 1 | N | -73.97478485107422 | 40.75061798095703 | 1 | 12.0 | 1.0 | 0.5 | 3.25 | 0.0 | 0.3 | 17.05 | POINT (40.7501106262207 -73.993896484375) |
1 | 1 | 2015-01-10 20:33:38 | 2015-01-10 20:53:28 | 1 | 3.3 | -74.00164794921875 | 40.7242431640625 | 1 | N | -73.99441528320312 | 40.75910949707031 | 1 | 14.5 | 0.5 | 0.5 | 2.0 | 0.0 | 0.3 | 17.8 | POINT (40.7242431640625 -74.00164794921875) |
2 | 1 | 2015-01-10 20:33:38 | 2015-01-10 20:43:41 | 1 | 1.8 | -73.96334075927734 | 40.80278778076172 | 1 | N | -73.95182037353516 | 40.82441329956055 | 2 | 9.5 | 0.5 | 0.5 | 0.0 | 0.0 | 0.3 | 10.8 | POINT (40.80278778076172 -73.96334075927734) |
3 | 1 | 2015-01-10 20:33:39 | 2015-01-10 20:35:31 | 1 | 0.5 | -74.00908660888672 | 40.71381759643555 | 1 | N | -74.00432586669922 | 40.71998596191406 | 2 | 3.5 | 0.5 | 0.5 | 0.0 | 0.0 | 0.3 | 4.8 | POINT (40.71381759643555 -74.00908660888672) |
4 | 1 | 2015-01-10 20:33:39 | 2015-01-10 20:52:58 | 1 | 3.0 | -73.97117614746094 | 40.762428283691406 | 1 | N | -74.00418090820312 | 40.742652893066406 | 2 | 15.0 | 0.5 | 0.5 | 0.0 | 0.0 | 0.3 | 16.3 | POINT (40.762428283691406 -73.97117614746094) |
995 | 1 | 2015-01-07 20:40:04 | 2015-01-07 20:43:01 | 1 | 0.5 | -73.99357604980469 | 40.74185943603515 | 1 | N | -73.99305725097656 | 40.74549102783203 | 1 | 4.0 | 0.5 | 0.5 | 1.59 | 0.0 | 0.3 | 6.89 | POINT (40.74185943603515 -73.99357604980469) |
996 | 1 | 2015-01-07 20:40:04 | 2015-01-07 21:02:43 | 2 | 4.2 | -74.00177764892578 | 40.73946762084961 | 1 | N | -73.95372772216797 | 40.767356872558594 | 1 | 17.5 | 0.5 | 0.5 | 2.0 | 0.0 | 0.3 | 20.8 | POINT (40.73946762084961 -74.00177764892578) |
997 | 1 | 2015-01-07 20:40:04 | 2015-01-07 20:47:36 | 2 | 1.2 | -74.00238037109375 | 40.73783493041992 | 1 | N | -73.98587799072266 | 40.72793960571289 | 2 | 7.0 | 0.5 | 0.5 | 0.0 | 0.0 | 0.3 | 8.3 | POINT (40.73783493041992 -74.00238037109375) |
998 | 1 | 2015-01-07 20:40:05 | 2015-01-07 21:04:15 | 1 | 4.5 | -73.98628997802734 | 40.75269317626953 | 1 | N | -73.9522933959961 | 40.808624267578125 | 1 | 20.0 | 0.5 | 0.5 | 4.26 | 0.0 | 0.3 | 25.56 | POINT (40.75269317626953 -73.98628997802734) |
999 | 1 | 2015-01-07 20:40:05 | 2015-01-07 20:50:35 | 2 | 2.6 | -73.9290771484375 | 40.75468826293945 | 1 | N | -73.95439147949219 | 40.73670959472656 | 2 | 10.5 | 0.5 | 0.5 | 0.0 | 0.0 | 0.3 | 11.8 | POINT (40.75468826293945 -73.9290771484375) |
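The preview above lists rows 0 to 4 and 995 to 999, which suggests that with n_rows=10 the report keeps the head and the tail of the 1,000-row table. The following is a minimal standard-library sketch of that selection, not Skrub’s actual implementation; preview_indices is a hypothetical helper introduced here for illustration, and the way an odd n_rows is split is an assumption.

```python
def preview_indices(total_rows: int, n_rows: int = 10) -> list[int]:
    """Return the row positions a head-and-tail preview would keep."""
    if n_rows <= 0:
        return []
    if total_rows <= n_rows:
        # Small tables are shown in full.
        return list(range(total_rows))
    head = (n_rows + 1) // 2  # assumption: an odd n_rows gives the extra row to the head
    tail = n_rows - head
    return list(range(head)) + list(range(total_rows - tail, total_rows))


# Positions kept for a 1,000-row table previewed with n_rows=10
print(preview_indices(1000, 10))
```

For the 1,000 rows loaded here, this yields the same ten positions as the table above.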
Column name | dtype | Null values | Unique values | Mean ± Std | Median ± IQR | Min | Max |
---|---|---|---|---|---|---|---|
VendorID | Int64DType | 0 (0.0%) | 2 (0.2%) | 1.45 ± 0.498 | 1 ± 1 | 1 | 2 |
tpep_pickup_datetime | ObjectDType | 0 (0.0%) | 329 (32.9%) | | | | |
tpep_dropoff_datetime | ObjectDType | 0 (0.0%) | 989 (98.9%) | | | | |
passenger_count | Int64DType | 0 (0.0%) | 6 (0.6%) | 1.55 ± 1.16 | 1 ± 1 | 1 | 6 |
trip_distance | Float64DType | 0 (0.0%) | 356 (35.6%) | 2.78 ± 3.12 | 1.70 ± 2.20 | 0.00 | 28.7 |
pickup_longitude | Float64DType | 0 (0.0%) | 924 (92.4%) | -72.6 ± 9.84 | -74.0 ± 0.0270 | -74.0 | 0.00 |
pickup_latitude | Float64DType | 0 (0.0%) | 958 (95.8%) | 40.0 ± 5.42 | 40.8 ± 0.0340 | 0.00 | 40.8 |
RateCodeID | Int64DType | 0 (0.0%) | 5 (0.5%) | 1.04 ± 0.343 | 1 ± 0 | 1 | 5 |
store_and_fwd_flag | ObjectDType | 0 (0.0%) | 2 (0.2%) | | | | |
dropoff_longitude | Float64DType | 0 (0.0%) | 934 (93.4%) | -72.6 ± 10.1 | -74.0 ± 0.0322 | -74.2 | 0.00 |
dropoff_latitude | Float64DType | 0 (0.0%) | 961 (96.1%) | 40.0 ± 5.57 | 40.8 ± 0.0367 | 0.00 | 41.0 |
payment_type | Int64DType | 0 (0.0%) | 3 (0.3%) | 1.38 ± 0.497 | 1 ± 1 | 1 | 3 |
fare_amount | Float64DType | 0 (0.0%) | 94 (9.4%) | 12.3 ± 9.61 | 9.50 ± 8.00 | 0.00 | 83.5 |
extra | Float64DType | 0 (0.0%) | 3 (0.3%) | 0.383 ± 0.372 | 0.500 ± 0.500 | 0.00 | 1.00 |
mta_tax | Float64DType | 0 (0.0%) | 2 (0.2%) | 0.497 ± 0.0386 | 0.500 ± 0.00 | 0.00 | 0.500 |
tip_amount | Float64DType | 0 (0.0%) | 186 (18.6%) | 1.63 ± 2.25 | 1.00 ± 2.25 | 0.00 | 20.0 |
tolls_amount | Float64DType | 0 (0.0%) | 6 (0.6%) | 0.221 ± 1.25 | 0.00 ± 0.00 | 0.00 | 16.0 |
improvement_surcharge | Float64DType | 0 (0.0%) | 2 (0.2%) | 0.276 ± 0.0810 | 0.300 ± 0.00 | 0.00 | 0.300 |
total_amount | Float64DType | 0 (0.0%) | 298 (29.8%) | 15.3 ± 11.8 | 11.8 ± 9.10 | 0.300 | 98.6 |
geometry | GeometryDtype | 0 (0.0%) | 983 (98.3%) | | | | |
Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
---|---|---|---|---|---|---|---|---|---|
0 | VendorID | Int64DType | 0 (0.0%) | 2 (0.2%) | 1.45 | 0.498 | 1 | 1 | 2 |
1 | tpep_pickup_datetime | ObjectDType | 0 (0.0%) | 329 (32.9%) | |||||
2 | tpep_dropoff_datetime | ObjectDType | 0 (0.0%) | 989 (98.9%) | |||||
3 | passenger_count | Int64DType | 0 (0.0%) | 6 (0.6%) | 1.55 | 1.16 | 1 | 1 | 6 |
4 | trip_distance | Float64DType | 0 (0.0%) | 356 (35.6%) | 2.78 | 3.12 | 0.00 | 1.70 | 28.7 |
5 | pickup_longitude | Float64DType | 0 (0.0%) | 924 (92.4%) | -72.6 | 9.84 | -74.0 | -74.0 | 0.00 |
6 | pickup_latitude | Float64DType | 0 (0.0%) | 958 (95.8%) | 40.0 | 5.42 | 0.00 | 40.8 | 40.8 |
7 | RateCodeID | Int64DType | 0 (0.0%) | 5 (0.5%) | 1.04 | 0.343 | 1 | 1 | 5 |
8 | store_and_fwd_flag | ObjectDType | 0 (0.0%) | 2 (0.2%) | |||||
9 | dropoff_longitude | Float64DType | 0 (0.0%) | 934 (93.4%) | -72.6 | 10.1 | -74.2 | -74.0 | 0.00 |
10 | dropoff_latitude | Float64DType | 0 (0.0%) | 961 (96.1%) | 40.0 | 5.57 | 0.00 | 40.8 | 41.0 |
11 | payment_type | Int64DType | 0 (0.0%) | 3 (0.3%) | 1.38 | 0.497 | 1 | 1 | 3 |
12 | fare_amount | Float64DType | 0 (0.0%) | 94 (9.4%) | 12.3 | 9.61 | 0.00 | 9.50 | 83.5 |
13 | extra | Float64DType | 0 (0.0%) | 3 (0.3%) | 0.383 | 0.372 | 0.00 | 0.500 | 1.00 |
14 | mta_tax | Float64DType | 0 (0.0%) | 2 (0.2%) | 0.497 | 0.0386 | 0.00 | 0.500 | 0.500 |
15 | tip_amount | Float64DType | 0 (0.0%) | 186 (18.6%) | 1.63 | 2.25 | 0.00 | 1.00 | 20.0 |
16 | tolls_amount | Float64DType | 0 (0.0%) | 6 (0.6%) | 0.221 | 1.25 | 0.00 | 0.00 | 16.0 |
17 | improvement_surcharge | Float64DType | 0 (0.0%) | 2 (0.2%) | 0.276 | 0.0810 | 0.00 | 0.300 | 0.300 |
18 | total_amount | Float64DType | 0 (0.0%) | 298 (29.8%) | 15.3 | 11.8 | 0.300 | 11.8 | 98.6 |
19 | geometry | GeometryDtype | 0 (0.0%) | 983 (98.3%) |
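The report summarises each numeric column with its mean, standard deviation, median, interquartile range (IQR), minimum and maximum. The sketch below computes the same kind of summary with Python’s statistics module on the five trip_distance values from the data.head() preview. It is an illustration only: summarise is a hypothetical helper, and the quantile method Skrub uses for the IQR may differ from the "inclusive" method assumed here.

```python
import statistics


def summarise(values: list[float]) -> dict[str, float]:
    """Compute report-style summary statistics for one numeric column."""
    # Quartiles with the "inclusive" method (an assumption, see the note above).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "median": statistics.median(values),
        "iqr": q3 - q1,
        "min": min(values),
        "max": max(values),
    }


# trip_distance for the five rows shown by data.head()
trip_distance = [1.59, 3.30, 1.80, 0.50, 3.00]
for name, value in summarise(trip_distance).items():
    print(f"{name}: {value:.3g}")
```

With only five rows the numbers differ from the report, which is computed over all 1,000 rows.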
Column 1 | Column 2 | Cramér's V | Pearson's Correlation |
---|---|---|---|
dropoff_longitude | dropoff_latitude | 1.00 | -1.00 |
pickup_longitude | geometry | 1.00 | |
pickup_latitude | geometry | 1.00 | |
pickup_longitude | pickup_latitude | 1.00 | -1.00 |
dropoff_longitude | geometry | 0.918 | |
dropoff_latitude | geometry | 0.918 | |
pickup_latitude | dropoff_longitude | 0.918 | -0.918 |
pickup_latitude | dropoff_latitude | 0.918 | 0.918 |
pickup_longitude | dropoff_longitude | 0.918 | 0.918 |
pickup_longitude | dropoff_latitude | 0.918 | -0.918 |
RateCodeID | mta_tax | 0.894 | -0.671 |
fare_amount | mta_tax | 0.780 | -0.322 |
RateCodeID | fare_amount | 0.720 | 0.487 |
mta_tax | tolls_amount | 0.706 | -0.458 |
fare_amount | total_amount | 0.699 | 0.983 |
mta_tax | total_amount | 0.672 | -0.348 |
trip_distance | fare_amount | 0.669 | 0.897 |
RateCodeID | total_amount | 0.657 | 0.467 |
tolls_amount | total_amount | 0.656 | 0.608 |
fare_amount | tolls_amount | 0.653 | 0.539 |
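The associations table reports a Pearson correlation of -1.00 between pickup_longitude and pickup_latitude, while the summary statistics show 0.00 as the maximum longitude and the minimum latitude. A plausible explanation, not confirmed by the source, is that a few rows carry (0, 0) placeholder coordinates that dominate the correlation. The sketch below uses synthetic points (not the taxi data) and a hand-written pearson helper to show the effect.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic pickups scattered tightly around one city-sized area.
lons = [-73.98 + 0.01 * math.sin(i) for i in range(98)]
lats = [40.75 + 0.01 * math.cos(i) for i in range(98)]
r_clean = pearson(lons, lats)

# The same points plus two (0, 0) placeholder coordinates.
r_with_zeros = pearson(lons + [0.0, 0.0], lats + [0.0, 0.0])

print(f"without zeros: {r_clean:.3f}")
print(f"with zeros:    {r_with_zeros:.3f}")
```

If your own data shows the same pattern, filtering out rows whose coordinates are exactly zero before building the report is one way to check whether the association is an artefact.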
Wrapping Up
That’s it! 🎈 You’ve successfully loaded your data with UrbanMapper and visualised it interactively using TableVisMixin. This interactive display makes it easy to explore your dataset. Feel free to tweak n_rows, order_by, or other parameters to customise the view!