  • This event has passed.

Seattle Data Science: Apache Spark Lightning Talks

January 17 @ 6:30 pm - 8:30 pm


This event will feature a series of brief, engaging lightning talks with data scientists discussing Apache Spark.

Speakers keep their presentations under 20 minutes and then take audience questions for 15 minutes.

Who are these talks for?

These lightning talks are for anyone with a strong personal or professional interest in data science, data engineering, and/or Apache Spark. Beginners are welcome!

Why Spark?

Apache Spark is a powerful, open-source processing engine for data distributed across large clusters. Spark is optimized for speed and ease of use; it uses caching and memory to run distributed algorithms up to 100x faster than MapReduce. Spark can be used for batch processing and for processing data in near real-time.

Washington Supreme Court Opinions

David Valpey, Data Scientist

Our legal system depends on knowledge of what came before. Anyone working within our legal system must navigate a large amount of data. David describes his use of Python, Apache Spark, Numpy, NLTK, and BeautifulSoup to extract key features of Washington State Supreme Court opinions and navigate them by similarity.

David is a data scientist with a background in computer science and linguistics.

Album Recommendation Web App

Sal Khan, Data Scientist

Pandora and other popular music services recommend individual songs based on their musical characteristics. Because he prefers listening to entire albums, Sal built an album recommendation system using Spark ML. He also deployed it as a live web application using Flask and uWSGI.

Sal is a data scientist with a background in consulting and business analytics.


Mining Reviews for Product Features

Rob Dalton, Data Scientist

Amazon and other online shopping sites provide information on product quality in the form of customer reviews. Rob Dalton has built a PySpark application that mines Amazon product reviews, extracting the most criticized features and the most praised features of each product. This tool can help customers save time spent reading full reviews, and it can help companies identify potential product defects.

Rob is a data scientist with a background in management consulting and web development.


About Galvanize
Galvanize is the premier dynamic learning community for technology. With campuses located in booming technology sectors throughout the country, Galvanize provides a community for each of the following:
Education – part-time and full-time training in web development, data science, and data engineering
Workspace – whether you’re a freelancer, startup, or established business, we provide beautiful spaces with a community dedicated to supporting your company’s growth
Networking – events in the tech industry happen constantly on our campuses, ranging from popular Meetups to multi-day international conferences
To learn more about Galvanize, visit galvanize.com.
To learn more about our data science initiatives, please visit this link: http://www.galvanize.com/data-science/


111 South Jackson Street
Seattle, WA 98104 US


Galvanize Seattle