From monoliths to clusters: transforming ALA s infrastructure

Similar documents
Consortium of Pacific Northwest Herbaria. Ben Legler University of Washington

From paper to cloud: elevating text from specimens and field notes to the web.

Secure Your Way of Life. Now Compa ble With. Climax Home Portal Platform. Enable a Connected Future

GE Security. Proven. Powerful. Integrated fire and life safety solutions.

Different types of Fire Alarm System

TPS Virtualization and Future Virtual Developments Paul Hodge, Honeywell

Enterprise GIS Architecture Deployment Options

Calcolatori, Internet e il Web

THE VALUE IS IN THE SOFTWARE

Web Services are based on Apache and Tomcat servers. The html/jsp (tags, beans) are fully customisable and extendable.

Enterprise Service Bus

ASSA ABLOY Hospitality Mobile Access

Compact Product Suite Compact HMI 6.0 Overview ABB

UltraSync Modular Hub

ASSA ABLOY Hospitality Mobile Access

Why and How to Migrate from Oracle Reports to JasperReports [WEBINAR]

FALCO Access Control. Product Training

Skyresponse ThingWorx Extension. Version 1.0. User Guide

Using the PNW Herbaria web site to study Washington s flora: tools, tips and tricks

NEW FEATURE GUIDE BioStar 2.1 English Version 1.00

DynAMo Alarm & Operations Management

USER S GUIDE. AXIS Cross Line Detection

GeoPlanner SM for ArcGIS : An Introduction

Moving water, moving forward. Integrating global expertise with local knowledge

Designing the Right Access Control System A Case Study. 12 th May 2010

Experion LX Direct Station Specification

ORVIBO BROCHURE خانه هوشمند فرااسمارت ORVIBO INTERNATIONAL INC Lundy Avenue Suite 216 San Jose CA United States

Facility Commander Wnx

Update - Home Improvement

Remote Pump Monitoring

Salusfin Smart heating control: Installation Guide

OnGuard 7.1 Resolved Issues

Contents. What is Paxton10? 3. How does it work? 4. Paxton10 hardware 5. Paxton10 software 8. Paxton10 portal 9. Useful information 10

Fike Safety y Technology Ltd

Using ANM Mobile CHAPTER

Integrated security management platform for Windows. Seamless. Effective. Efficient.

Rapid Site Solution. commtelns.com

Serving Fremont Employees with a One Stop Shop Enterprise Web App

GE Security. Wnx. Facility Commander. Integrated security management platform for Windows. Seamless. Effective. Efficient.

Oracle Real Application Clusters (RAC) 12c Release 2 What s Next?

CX/CXB Self-Managed Ethernet Switches N-664 Network Appliance Data and Alarm I/O Device MX Series Multi-Port Switches

STRIPEYFISH. Utilities for VMware Series. sfvalarms User Guide

Fire Safety Awareness

Protecting Network Data

2014 South Atlantic LCC

Intelligent Keys. A smart solution for recurring revenue

Getting started with

Perimeter Product Overview. Effective protection for your business

Moving water, moving forward. Integrating global expertise with local knowledge

UD-VMS510i. Surveillance Management Center

Moving to the Cloud: The Potential of Hosted Central Station Services

IPPBX NURSE CALL SYSTEM (NCS)

Lowering Customer Effort Building Effective Right-Channeling into your Online Service Experience

Vision for Mayfair and Belgravia

European Location Framework (ELF), INSPIRE and the European Interoperability Framework (EIF)

OnGuard 7.2 Resolved Issues

CODAC Core System 5.4 CS-Studio Release Notes

TRIDENT GS SERIES Central Chiller

The Surveyor's Role in Developing a Sustainable Society. XXIV FIG International Congress. 15 th April Overview

Product Brief for Alarm Analytics V9.3 October 2013

Thailand Technology Summit 2016

FIRE SAFETY MANAGEMENT PLAN

LightSweep. Lighting Control System

EQ-ROBO Programming : Cliff Check Robot

Practical Home Automation

Serverless IoT at irobot. Ben Cloud Robotics Research Scientist

HAWK 8000 CONNECTED BUILDING MANAGEMENT

VISEUM UK PRESENTATION

Creating 3D Interior Maps for Campus Planning

Devna Pankhania & Nigel Tapper 1. School of Earth, Atmosphere and Environment 1.

User & Installer Manual SMT-400 "Enterprise" Wi-Fi Thermostat

A Smart & Integrated Security solution

Affiliate brand guidelines Version 1.0

RADIOLOGICAL SURVEILLANCE & ALARM MANAGEMENT SYSTEMS

Hard rock mining solutions

Today s world has more risk than ever. Today s G4S has more ways to protect you.

Manual & Technical Documentation V1.1

Interfacing with the ADAM Module

HMI of the Future with Experion Orion Console and Experion Collaboration Station Andrew Stuart & Annemarie Diepenbroek, Honeywell

ArcGIS Online at Philadelphia Water Department

This is your cloud. Paris OpenStack Summit

The product segments Heating, ventilation, air conditioning. Control Products & Systems OEM.

VTScadaLIGHT for Small Water & Wastewater Systems

Sync. Smart Home Alarm


NU2 Smart Power Supply (SPS) Manual

Local Placemaking Opportunities

Workshop on the Management of Historic Urban Landscapes of the XXth century, December 2007 Chandigarh, India

The Compact Muon Solenoid Experiment. Conference Report. Mailing address: CMS CERN, CH-1211 GENEVA 23, Switzerland

smart modules An integrated and customised video analytics solution for the real-time events detection

Au ra (noun): 1. A distinctive but intangible quality that seems to surround a thing; an atmosphere.

Proliphix. Remote Management. User Guide

Living Cities Workshop Wednesday February 10th, 2016 Parliament House, Canberra

Australian Standard. Functional safety Safety instrumented systems for the process industry sector

Micrel Case Study: Reducing Costs for Industrial Gas Cylinders Management 14 January 2008

PUBLIC ENGAGEMENT & SIX THEMES OF THE PLAN

Design. Innovations in Integrated Control Systems

Answering your needs VIDEO DOOR ENTRY SYSTEMS

HikCentral Web Client. User Manual

Gas Value Chain Solutions. Randy Miller

Transcription:

From monoliths to clusters: transforming ALA s infrastructure Matt Andrews DevOps Manager 18 October 2018 The ALA is made possible by contributions from its many partners. It receives support through the Australian Government through the National Collaborative Research Infrastructure Strategy (NCRIS) and is hosted by CSIRO.

From monoliths to clusters Matt Andrews 2

From monoliths to clusters Matt Andrews 3

From monoliths to clusters Matt Andrews 4

From monoliths to clusters Matt Andrews 5

From monoliths to clusters Matt Andrews 6

From monoliths to clusters Matt Andrews 7

From monoliths to clusters Matt Andrews 8

From monoliths to clusters Matt Andrews 9

Atlas core services BIE species search, taxonomy Biocache occurrences Collectory natural history collections DOI data downloads for publication Spatial portal spatial analysis Species lists User authentication From monoliths to clusters Matt Andrews 10

From monoliths to clusters Matt Andrews 11

Atlas usage Over the last year: 1.1 million user sessions 345,000 unique visitors 85% from Australia From monoliths to clusters Matt Andrews 12

From monoliths to clusters Matt Andrews 13

From monoliths to clusters Matt Andrews 14

Occurrence records An event: observation of a species Over 78 million in the ALA Typically ~1000 fields per record Over 120,000 species From monoliths to clusters Matt Andrews 15

More ALA stats Over 400 spatial layers Over 1 million images Over 1.5 million downloads Over 6000 datasets Over 40,000 users From monoliths to clusters Matt Andrews 16

Atlas core services BIE species search, taxonomy Biocache occurrences Collectory natural history collections DOI data downloads for publication Spatial portal spatial analysis Species lists User authentication From monoliths to clusters Matt Andrews 17

From monoliths to clusters Matt Andrews 18

From monoliths to clusters Matt Andrews 19

From monoliths to clusters Matt Andrews 20

From monoliths to clusters Matt Andrews 21

From monoliths to clusters Matt Andrews 22

From monoliths to clusters Matt Andrews 23

From monoliths to clusters Matt Andrews 24

From monoliths to clusters Matt Andrews 25

From monoliths to clusters Matt Andrews 26

From monoliths to clusters Matt Andrews 27

From monoliths to clusters Matt Andrews 28

From monoliths to clusters Matt Andrews 29

From monoliths to clusters Matt Andrews 30

From monoliths to clusters Matt Andrews 31

From monoliths to clusters Matt Andrews 32

Biocache data store: Apache Cassandra search: Apache Solr biocache applications From monoliths to clusters Matt Andrews 33

Biocache data store: Apache Cassandra search: Apache Solr biocache applications: biocache-hub: front end (browsers) biocache-service: web services biocache-store: data handling From monoliths to clusters Matt Andrews 34

Old biocache one Cassandra server one Solr server one biocache-hub server a pair of servers for biocache-service biocache-store with Cassandra From monoliths to clusters Matt Andrews 35

Data cycle data quality checks initial import into Cassandra processing environmental sampling fully processed records in Cassandra generate indexes for Solr From monoliths to clusters Matt Andrews 36

Old biocache any maintenance caused downtime any server issue caused downtime reaching hard limits on data size data cycle extremely slow and brittle From monoliths to clusters Matt Andrews 37

Old biocache any maintenance caused downtime any server issue caused downtime reaching hard limits on data size data cycle extremely slow and brittle: processing and sampling: six days indexing: 24 hours From monoliths to clusters Matt Andrews 38

Old biocache any maintenance caused downtime any server issue caused downtime reaching hard limits on data size data cycle extremely slow and brittle: processing and sampling: 6 days indexing: 24 hours no scalability From monoliths to clusters Matt Andrews 39

Designing a solution From monoliths to clusters Matt Andrews 40

Designing a solution latest versions of Cassandra and Solr Cassandra cluster Solr Cloud cluster applications in load balanced pools From monoliths to clusters Matt Andrews 41

IO improvements Cassandra from 1 to 4 nodes From monoliths to clusters Matt Andrews 42

IO improvements Cassandra from 1 to 4 nodes biocache-store to 4 separate servers From monoliths to clusters Matt Andrews 43

Building the solution Cassandra: single-node version 1 to clustered version 3 Solr: single-node version 4 to clustered version 6 From monoliths to clusters Matt Andrews 44

From monoliths to clusters Matt Andrews 45

Initial UK version Cassandra: four nodes Solr: eight nodes Fast, fault-tolerant data ingestion Larger number of occurrences, but much smaller width (number of fields) From monoliths to clusters Matt Andrews 46

AU version development Around 18 months elapsed, 2-5 developers Fewer occurrence records, but much wider data (more fields) Needed to maintain backwards compatibility for most types of request From monoliths to clusters Matt Andrews 47

AU version development Around 18 months elapsed, 2-5 developers Fewer occurrence records, but much wider data (more fields) Needed to maintain backwards compatibility for most types of request Iterative configuration testing in multiple components From monoliths to clusters Matt Andrews 48

AU version development Performance initially much lower than legacy Eventually got performance to similar level Costs slightly higher Switched to new version in July 2018 From monoliths to clusters Matt Andrews 49

Data cycle changes Processing and sampling: Old system - 6 days, often repeated due to low fault tolerance From monoliths to clusters Matt Andrews 50

Data cycle changes Processing and sampling: Old system - 6 days, often repeated due to low fault tolerance New system 11 hours, highly fault tolerant From monoliths to clusters Matt Andrews 51

Data cycle changes Processing and sampling: Old system - 6 days, often repeated due to low fault tolerance New system 11 hours, highly fault tolerant Indexing: Old system 24 hours From monoliths to clusters Matt Andrews 52

Data cycle changes Processing and sampling: Old system - 6 days, often repeated due to low fault tolerance New system 11 hours, highly fault tolerant Indexing: Old system 24 hours New system 3.5 hours Now indexing every few days From monoliths to clusters Matt Andrews 53

Robustness clustered Cassandra and Solr can now tolerate having a node go down biocache applications (biocache-hub and biocache-service) now pooled behind load balancers, can tolerate a server going down From monoliths to clusters Matt Andrews 54

Scalability expanding Cassandra or Solr capacity is relatively straightforward expecting to be able to handle future major expansion of data width (species traits) From monoliths to clusters Matt Andrews 55

Biocache data store: Apache Cassandra search: Apache Solr biocache applications From monoliths to clusters Matt Andrews 56

Thank you Matt Andrews DevOps Manager Atlas of Living Australia matt.andrews@csiro.au The ALA is made possible by contributions from its many partners. It receives support through the Australian Government through the National Collaborative Research Infrastructure Strategy (NCRIS) and is hosted by CSIRO.