Cyble (W21)
Cyble empowers organizations with darkweb & cybercrime monitoring.

Senior Big Data Engineer

Remote
Full-time
3+ years
About Cyble

Cyble is a cyber intelligence company that empowers organizations with darkweb & cybercrime monitoring and mitigation services.

About the role

Skills: Google Cloud, Hadoop, Deep Learning

About the job

We are seeking Senior Big Data Engineers to join our team at Cyble. Candidates should have a strong track record in data engineering, expertise in wrangling petabyte-scale data, and excellent academic records from premier institutes (IIT, NIT, etc.).

About Cyble: Cyble is a global cyber intelligence start-up backed by Y Combinator and reputed VC firms including Blackbird Ventures, Spider Capital, Xoogler Ventures, Picus Capital, and Cathexis Ventures. Cyble provides capabilities for customers to manage cyber risks with AI-powered actionable threat intelligence. We are specialists in gathering intelligence across the Deepweb, Darkweb, and Surface Web. We are a fast-growing startup recognized by Forbes, along with other distinctions (https://cyble.com/press.php).

Your role:

As a Senior Big Data Engineer, you will be responsible for designing and implementing the organization's entire data pipeline at petabyte scale with very low latency. You are expected to have deep technical, hands-on expertise in Big Data frameworks. You will:

Use your understanding of the business problem and the nature of the data to select an appropriate data management system
Design and implement optimum data structures in the appropriate data management system to satisfy the data requirements
Identify and select the optimum methods of access for each data source (real-time/streaming, delayed, static)
Determine transformation requirements and develop processes to bring structured and unstructured data from the source to a new physical data model
Develop processes to efficiently load the transformed data into the data management system
Develop and code data extracts
Follow standard methodologies to ensure data quality and data integrity

You have a deep understanding of the following competencies.

Your competencies:

Distributions: Cloudera, DataStax Enterprise, Hortonworks, MapR
Data Ingestion: Flume NG, Sqoop, Apache Kafka, Logstash
Data Staging and Storage: Amazon S3, HDFS, DCFS, Google Cloud Storage, MapR FS, MapR XD, MapR Object Storage
Data Transformation: Hadoop MapReduce, Pig, Spark, Logstash, AWS Lambda
SQL & NoSQL: Hive, Cassandra, HBase, MapR DB, Drill, Elastic DB, BigQuery
Language: Java, Scala, Shell Scripting, PowerShell, Python
Orchestration: Oozie, AutoSys, Jenkins, Airflow, ZooKeeper
Cloud Technologies: AWS, GCP, On-Prem Implementation
Specialization: Cluster Design, Data Mapping, Ingestion, Modeling, Mining, Governance, Lineage, Pipeline design, Conformance, Profiling, Fabrication, Vault, Data security, Monolithic Architecture, Microservices architecture, Multitenant Model, Point to Point architecture, Hub and Spoke Model, Shared Services Model

Qualifications and experience:

Bachelor's or Master's degree in Computer Science, Software Engineering, Electrical Engineering, Applied Mathematics, or a related field of study
4-6 years of experience developing, delivering, and/or supporting data engineering solutions at terabyte to petabyte scale
Extensive experience in search engine technologies and Elasticsearch
Experience in developing ETL/ELT processes
Significant experience with big data processing and developing applications and data sources via Hadoop, YARN, Hive, Pig, Sqoop, MapReduce, HBase, Flume, etc.
Understanding of how distributed systems work
Familiarity with software architecture (data structures, data schemas, etc.)
Strong working knowledge of databases (Oracle, MSSQL, etc.), including SQL and NoSQL
Strong mathematics background and strong analytical, problem-solving, and organizational skills
Strong communication skills (written, verbal, and presentation)

What we bring to the table:

Competitive compensation packages
A fast-growing organization with significant growth opportunities
A fast-paced, fun, and collaborative environment
Co-workers who will care for you both personally and professionally

Technology

Our SaaS-based enterprise platform collects intelligence data in real-time across open and closed sources. This enables you to map, monitor, and mitigate your digital risk footprint.

Through a combination of our industry-leading Machine Learning capabilities and our peerless Human Analytics, we deliver actionable threat intel well before your organization is at risk.
