Identification of Aberrant Railroad Wayside WILD and THD Detectors: Using Industry-wide Railroad Data June 10, 2014 1
Agenda Background / Overview Solution Approach Groupings Criteria Sites Speed Weight Car Type Solution Architecture (with SAS) REPORTING: Dashboards to Analyze WILD Sites 2
WILD Detectors WILD: WHEEL IMPACT LOAD DETECTOR 3
WILD Detectors Data Collection Railinc TTX 4
WILD Detector Locations 158 WILD detectors in N. A. TTX receives 375,000 records per day. 5
Project Overview Wayside Detectors Business Problem Goals Located at about 158 WILD sites in the North America Detect Wheel impacts Can exhibit aberrant behavior (e.g. due to flooding etc.) Aberrant detector can lead to wrong detection and unnecessary wheel repair expenses An aberrant detector could lead to removal of 200-500 healthy wheels (costing $2-3K each) before it is identified (cost = $1M+) Once suspected, it takes at least 2-3 weeks of manual work to analyze data and verify it Build an interactive tool for rapid identification of aberrant wayside detectors / health diagnostics using INDUSTRY-WIDE DATA Evaluate readings grouped by Speed, Weight and Car type for better precision Advanced Enterprise Reporting system & Insights (Dashboards/UI, email alerts) 6
TTX-CARS ONLY WILD DATA PREVIOUS WORK/BACKGROUND 7
Variables affecting Alerts from Detectors Factors that influence Number of Alerts from a Detector Variability of measurement system Alerts differ across time (seasonality, winter temperature) Alerts differ across detectors (railcar types, location) Metrics APW: Alert per 10,000 wheels passed Defined for each alert level
Variability is Normal in WILD Detectors 9
1/1/2009 1/31/2009 3/2/2009 4/1/2009 5/1/2009 5/31/2009 6/30/2009 7/30/2009 8/29/2009 9/28/2009 10/28/2009 11/27/2009 12/27/2009 1/26/2010 2/25/2010 3/27/2010 4/26/2010 5/26/2010 6/25/2010 7/25/2010 8/24/2010 9/23/2010 10/23/2010 11/22/2010 12/22/2010 1/21/2011 2/20/2011 3/22/2011 4/21/2011 5/21/2011 6/20/2011 7/20/2011 8/19/2011 9/18/2011 10/18/2011 11/17/2011 12/17/2011 Temperature / Seasonal variation* 4 3.5 2009 T = -14 C 2011 T = -13 C APW 3 2.5 2 1.5 2010 T = -9 C 2012 T = -7 C 1 0.5 0 More WILD alerts during winter months Colder winter leads to more alerts *TTX-CARS ONLY WILD DATA 10
Grouping*» Example of grouping sites based on similar environmental criteria not by rail road. 11
Grouping* Used to handle seasonality and temperature related effects APW 35 Group 1 35 Group 2 30 30 25 25 20 20 15 15 10 10 5 5 0 0 *TTX-CARS ONLY WILD DATA 12
Detector Specific Variations* APW differs from detector to detector 12 10 APW 8 6 4 2 0 *TTX-CARS ONLY WILD DATA 13
Normalized APW Normalization for comparison across sites 1-Jan-10 10-Feb-10 22-Mar-10 1-May-10 10-Jun-10 20-Jul-10 29-Aug-10 8-Oct-10 17-Nov-10 27-Dec-10 5-Feb-11 17-Mar-11 26-Apr-11 5-Jun-11 15-Jul-11 24-Aug-11 3-Oct-11 12-Nov-11 22-Dec-11 31-Jan-12 11-Mar-12 20-Apr-12 30-May-12 1-Jan-10 13-Feb-10 28-Mar-10 10-May-10 22-Jun-10 4-Aug-10 16-Sep-10 29-Oct-10 11-Dec-10 23-Jan-11 7-Mar-11 19-Apr-11 1-Jun-11 14-Jul-11 26-Aug-11 8-Oct-11 20-Nov-11 2-Jan-12 14-Feb-12 28-Mar-12 10-May-12 35 30 APW 6 5 APW 3-year Average 25 4 20 3 15 2 10 5 1 0 0 *TTX-CARS ONLY WILD DATA 14
1-Jan-10 21-Jan-10 10-Feb-10 2-Mar-10 22-Mar-10 11-Apr-10 1-May-10 21-May- 10-Jun-10 30-Jun-10 20-Jul-10 9-Aug-10 29-Aug-10 18-Sep-10 8-Oct-10 28-Oct-10 17-Nov-10 7-Dec-10 27-Dec-10 16-Jan-11 5-Feb-11 25-Feb-11 17-Mar-11 6-Apr-11 26-Apr-11 16-May- 5-Jun-11 25-Jun-11 15-Jul-11 4-Aug-11 24-Aug-11 13-Sep-11 3-Oct-11 23-Oct-11 12-Nov-11 2-Dec-11 22-Dec-11 11-Jan-12 31-Jan-12 20-Feb-12 11-Mar-12 31-Mar-12 20-Apr-12 10-May- 30-May- Normalized APW Group Average and Standard Deviation 6 5 4 Mean ± std. dev. 3 2 1 0 *TTX-CARS ONLY WILD DATA 15
Group Average is the Benchmark Normalized APW 1-Jan-10 21-Jan-10 10-Feb-10 2-Mar-10 22-Mar-10 11-Apr-10 1-May-10 21-May- 10-Jun-10 30-Jun-10 20-Jul-10 9-Aug-10 29-Aug-10 18-Sep-10 8-Oct-10 28-Oct-10 17-Nov-10 7-Dec-10 27-Dec-10 16-Jan-11 5-Feb-11 25-Feb-11 17-Mar-11 6-Apr-11 26-Apr-11 16-May- 5-Jun-11 25-Jun-11 15-Jul-11 4-Aug-11 24-Aug-11 13-Sep-11 3-Oct-11 23-Oct-11 12-Nov-11 2-Dec-11 22-Dec-11 11-Jan-12 31-Jan-12 20-Feb-12 11-Mar-12 31-Mar-12 20-Apr-12 10-May- 30-May- 6 5 4 Mean ± std. dev. 3 2 1 0 *TTX-CARS ONLY WILD DATA 16
Normalized APW Detector Scoring 6 SCORE: How far from the average is aberrant? Yellow: Score > 2 Red: Score > 3 5 4 Score= Observed Group Avg Std.Dev. 3 2 1 0 *TTX-CARS ONLY WILD DATA 17
Example of Normal site Normalized APW 4 3.5 3 2.5 2 1.5 1 0.5 0 *TTX-CARS ONLY WILD DATA 18
Example of Faulty Detector (1) Normalized APW 6 5 4 3 2 1 0 *TTX-CARS ONLY WILD DATA 19
WILD INDUSTRY WIDE DATA 20
Site Groupings* (Industry-Wide Data) *Site Groupings is for Industry-wide data and is based on similarity in APW/Base APW (= S) trends, domain knowledge, geography and environmental criteria. 21
Comparison of Data Size Average Wheel Count Per Day for 2012 All Cars ~ 1,650,000 Approx. 4X more data TTX Only ~ 375,000-200,000 400,000 600,000 800,000 1,000,000 1,200,000 1,400,000 1,600,000 1,800,000 Industry Wide Data Volume ~ 1.5-2M rows of data PER DAY (~ 200 MB size) ~ 200-220 GB of data over last three years Stripped Car Numbers and Car Initials for anonymity purposes 22
Normalized APW Solution Approach Separating the signal from the noise - typical site 6 5 4 3 1 3 Compute S = APW / Base APW Determine benchmark = Group average 4 Score= APW = Alerts Wheels 10000 Observed Group Avg Std.Dev. Green: Score < 2 Yellow: Score > 2 Red: Score > 3 2 1 0 2 ---------- Benchmark Trend Detector Trend One time Group sites based on similarity in APW/Base APW (= S) trends, domain knowledge, geography 23
WILD TTX-ONLY CARS / INDUSTRY WIDE DATA COMPARISON 24
Parameter S TTX Cars vs. All Cars Group 1 (Site A) The lines form a very tight band (8/15/2011) 65% higher standard deviation. * For Alert Level 1, Site Direction 1 25
Example from Group 6 (Site B) 01-Jan-10 15-Feb-10 01-Apr-10 15-May-10 01-Jul-10 15-Aug-10 01-Oct-10 15-Nov-10 01-Jan-11 15-Feb-11 01-Apr-11 15-May-11 01-Jul-11 15-Aug-11 01-Oct-11 15-Nov-11 01-Jan-12 15-Feb-12 01-Apr-12 01-Jan-10 15-Feb-10 01-Apr-10 15-May-10 01-Jul-10 15-Aug-10 01-Oct-10 15-Nov-10 01-Jan-11 15-Feb-11 01-Apr-11 15-May-11 01-Jul-11 15-Aug-11 01-Oct-11 15-Nov-11 01-Jan-12 15-Feb-12 01-Apr-12 Normalized APW for Detector 4 TTX Cars * All Cars * 3.5 3.5 4 3 2.5 2 1.5 1 0.5 0 3 2.5 2 1.5 1 0.5 0 (8/15/2011) 360% higher standard deviation. 26
WILD ANALYSIS FOR SPEED, WEIGHT AND CAR TYPE FACTORS
Site D Track 1 All Cars (1) Comparison of Observed APW for Site D Track 1 By Speed* Higher APW at higher speeds APW * For Alert Level 1 and Site Direction 1 28
Site D Track 1 All Cars (2) Z-score Comparison of Z-Score for Site D Track 1 By Speed* Not Faulty at lower speeds Comparison of Data: Sample Data Point - May 1 st,2011 * Speed 30 Day Alerts 30 Day Wheels Alerts / 10k Wheels 0 40 mph 379 88,496 43 40 60 mph 5,608 288,855 194 60 80 mph 42 8,760 48 * For Alert Level 1 and Site Direction 1 29
Weight Bucket Analysis- # of Wheels & Alerts # of Wheels # of Alerts Mid-range (160,000 240,000 lbs) is further split into buckets of 20,000 lbs each 160,000 140,000 120,000 100,000 80,000 60,000 40,000 20,000 0 149,854 Empty(0-160,000 lbs) Avg. # of Wheels per Month per Site 6,196 4,323 4,350 8,422 Mid(160,000-180,000 lbs) Mid(180,000-200,000 lbs) Mid(200,000-220,000 lbs) Mid(220,000-240,000 lbs) 105,460 Fully Loaded(240,000-320,000 lbs) 157 Above Fully Loaded(320,000+ lbs) Avg. # of Condemnable Alerts Per Month Per Site 400 300 200 100 0 7 4 4 6 14 Empty(0-160,000 lbs) Mid(160,000-180,000 lbs) Mid(180,000-200,000 lbs) Mid(200,000-220,000 lbs) Mid(220,000-240,000 lbs) 298 Fully Loaded(240,000-320,000 lbs) < 1 Above Fully Loaded(320,000+ lbs) * 2010-13 data for all sites 30
Site D Track 1 Hopper & Tank Cars Comparison of Data: Sample Data Point - May 1 st,2011 * Car Weight 30 Day Alerts 30 Day Wheels Alerts per 10,000 Wheels (APW) % of Total Wheels Empty (0 160,000 lbs) Mid (160,000 240,000 lbs) 6 67,148 1 50.3% 18 1,423 126 1.1% Fully Loaded (240,000 320,000 lbs) 4,041 64,685 625 48.5% Above Fully Loaded (320,000 lbs +) 11 117 940 0.1% * For Alert Level 1, Site Direction 1, Speed Range 40 60 mph 31
Site D Track 1 Empty Versus Fully Loaded Comparison of Z-Score for Site D Track 1 By Weight* * For Alert Level 1, Site Direction 1, Speed Range 40 60 mph, Hopper & Tank Cars 32
Final Groupings Summary for All Data Speed Group SPEED CAR TYPE WEIGHT Train Speed (mph) S1 0-40 S2 40-60 S3 60-80 S4 80+ 40 60 mph is the most stable speed range Speeds lower than 40 mph do not produce sufficient load impact The detector readings become unstable at higher speeds Group C1 C2 C3 C4 C5 C6 C7 C8 C9 Car Umler Type Box Cars Covered Hopper / Tank Cars Gondolas Flat Cars Equipped & Unequipped Hoppers / Gondolas-GT Refrigerator Cars Vehicular Flat Cars Locomotives Conventional, Intermodal & Stack Cars X* Caboose, Passenger, Containers, Trailers, Special Types Weight Group W1 Car Weight 0-160,000 lbs W2 160,000-240,000 lbs W3 240,000-320,000 lbs W4 320,000 lbs + W1: Fully Empty / Nearly Empty W3: Fully Loaded / Almost Fully Loaded (+-10%)
WILD PERFORMANCE OF CAR GROUPS 34
Site D Track 1 By Car Type Comparison of Data: Sample Data Point Jan 1 st,2012 * Car Type 30 Day Alerts 30 Day Wheels APW % Total Wheels Flat Cars 56 1,112 503 1% Hopper & Tank Cars 2007 67,386 298 76% Gondolas 374 10,560 354 12% Vehicular Flat Cars Box Cars & Refrigerator Cars Stacked & Intermodal Cars 0 0 0 0% 23 616 373 1% 34 9,071 37 10% * For Alert Level 1, Site Direction 1, Speed Range 40 60 mph & Weight Range 240,000-320,000 lbs 35
Site D Track 1 By Car Type SAME DAY Faulty Not Faulty * For Alert Level 1, Site Direction 1, Speed Range 40 60 mph & Weight Range 240,000 320,000 lbs 36
SOLUTION ARCHITECTURE 37
Investigation 1 Aberrant Site 38
Investigation 1 Site Dashboard 39
Investigation1 Comparison of Rails Rail 1 Rail 2 Both rails are above the group average with Z-scores > 3. 40
Investigation 1 Comparison of Cars Covered hoppers Open hoppers Gondolas All car types representing 90% of the traffic had high alerts. 41
Investigation 2 No trouble found Railroad asked TCCI to investigate a detector giving a large number of alerts. No, the Z-score is 1.8. All mid west detectors were higher due to the long, cold winter. The traffic pattern changed. A lot more cars More weight per car 42
Conclusions Change the state of the art 1. From gut feel to clear statistical signals. 2. Weekly, automate analysis of 11.5 million signals from 158 detectors, and send notifications of abnormal readings. 3. Rapid process to investigate suspicious detectors. 43