Episodes

  • This story was originally published on HackerNoon at: https://hackernoon.com/how-im-building-an-ai-for-analytics-service.
    In this article I want to share my experience with developing an AI service for a web analytics platform called Swetrix.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-analysis, #ai, #analytics, #website-traffic, #software-architecture, #machine-learning, #predictive-analytics, #hackernoon-top-story, and more.

    This story was written by: @pro1code1hack. Learn more about this writer by checking @pro1code1hack's about page, and for more stories, please visit hackernoon.com.

    In this article, I want to share my experience developing an AI service for a web analytics platform called Swetrix. My aim was to develop a machine learning model that predicts future website traffic from the analytics data shown in a screenshot in the original article. The end goal is to give customers a clear view of the traffic their website can expect in the future.

  • This story was originally published on HackerNoon at: https://hackernoon.com/real-time-anomaly-detection-in-underwater-gliders-experimental-evaluation.
    This paper presents a real-time anomaly detection algorithm to enhance underwater glider safety using datasets from actual deployments.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-analysis, #machine-learning, #underwater-gliders, #anomaly-detection, #oceanography, #glider-navigation, #ocean-data, #marine-robotics, and more.

    This story was written by: @oceanography. Learn more about this writer by checking @oceanography's about page, and for more stories, please visit hackernoon.com.

    We apply the anomaly detection algorithm to four glider deployments across the coastal ocean of Florida and Georgia, USA. For evaluation, the anomaly detected by the algorithm is cross-validated by high-resolution glider DBD data and pilot notes. We simulate the online detection process on SBD and compare the result with that detected from DBD.

  • This story was originally published on HackerNoon at: https://hackernoon.com/real-time-anomaly-detection-in-underwater-gliders-abstract-and-intro.
    This paper presents a real-time anomaly detection algorithm to enhance underwater glider safety, using datasets from actual deployments.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-analysis, #machine-learning, #underwater-gliders, #anomaly-detection, #oceanography, #glider-navigation, #ocean-data, #marine-robotics, and more.

    This story was written by: @oceanography. Learn more about this writer by checking @oceanography's about page, and for more stories, please visit hackernoon.com.

    Underwater gliders are widely used in oceanography for a range of applications. However, unpredictable events like shark strikes or remora attachments can lead to abnormal glider behavior or even loss of the instrument. This paper employs an anomaly detection algorithm to assess operational conditions of underwater gliders in the real-world ocean environment. Prompt alerts are provided to glider pilots upon detecting any anomaly.

  • This story was originally published on HackerNoon at: https://hackernoon.com/the-power-of-universal-semantic-layers-insights-from-cube-co-founder-artyom-keydunov.
    What is a universal semantic layer, and how is it different from a semantic layer? Is there actual semantics involved? Who uses that, how, and what for?
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #analytics, #business-intelligence, #data-modeling, #data-integration, #knowledge-graph, #universal-semantic-layer, #data-visualization, #data-warehouses, and more.

    This story was written by: @linked_do. Learn more about this writer by checking @linked_do's about page, and for more stories, please visit hackernoon.com.

  • This story was originally published on HackerNoon at: https://hackernoon.com/a-comprehensive-guide-to-building-dolphinscheduler-320-production-grade-cluster-deployment.
    In version 3.2.0, DolphinScheduler introduces a series of new features and improvements, significantly enhancing its stability.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-science, #workflow-management, #opensource, #programming, #dolphinscheduler, #apache-dolphinscheduler, #big-data, #cluster-deployment, and more.

    This story was written by: @zhoujieguang. Learn more about this writer by checking @zhoujieguang's about page, and for more stories, please visit hackernoon.com.

    DolphinScheduler provides powerful workflow management and scheduling capabilities for data engineers by simplifying complex task dependencies. In version 3.2.0, DolphinScheduler introduces a series of new features and improvements, significantly enhancing its stability and availability in production environments.

  • This story was originally published on HackerNoon at: https://hackernoon.com/why-monitoring-a-distributed-database-more-complex-than-you-might-expect.
    Why is monitoring a distributed database more complex than you might expect?
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #distributed-databases, #opentelemetry, #monitor-distributed-databases, #apache-ignite, #monitoring-systems, #database-monitoring-challenges, #data-acquisition-models, #push-vs-pull-acquisition, and more.

    This story was written by: @ingvard. Learn more about this writer by checking @ingvard's about page, and for more stories, please visit hackernoon.com.

    In this article, we will dive into the complexities of monitoring distributed databases from the perspective of a monitoring system developer. I will try to cover the following topics: managing multiple nodes, network restrictions, and issues related to high throughput caused by a large number of metrics.

  • This story was originally published on HackerNoon at: https://hackernoon.com/outlier-detection-what-you-need-to-know.
    Decisions are usually based on the sample mean, which is very sensitive to outliers: a few extreme values can change it dramatically. So it is crucial to manage outliers.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #outlier-detection, #statistics, #python3, #variance-reducing, #what-is-outlier-detection, #bootstrap, #problem-formulation, #data-analysis, and more.

    This story was written by: @nataliaogneva. Learn more about this writer by checking @nataliaogneva's about page, and for more stories, please visit hackernoon.com.

    Analysts often encounter outliers in data during their work. Decisions are usually based on the sample mean, which is very sensitive to outliers. It is crucial to manage outliers to make the correct decision. Let's consider several simple and fast approaches for working with unusual values.
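
    To make the sensitivity of the sample mean concrete, here is a minimal sketch. The sample values are hypothetical, and the interquartile-range (Tukey) rule used for filtering is an assumption: it is one common simple approach, not necessarily one of the approaches covered in the story.

```python
import statistics

def iqr_filter(values, k=1.5):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical sample: nine typical values and one extreme value.
sample = [10, 12, 11, 13, 12, 11, 10, 12, 13, 500]

mean_with_outlier = statistics.mean(sample)      # pulled far above the typical values
median_with_outlier = statistics.median(sample)  # barely affected by the extreme value
cleaned = iqr_filter(sample)                     # extreme value removed
mean_cleaned = statistics.mean(cleaned)          # back in the typical range

print(mean_with_outlier, median_with_outlier, mean_cleaned)
```

    The median is shown only for comparison: it stays close to the typical values, while the mean moves sharply until the extreme value is filtered out.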

  • This story was originally published on HackerNoon at: https://hackernoon.com/instrument-variables-and-ab-testing-part-1.
    This article explores the mathematical details of the least squares estimator in unbiased settings and in settings biased by model specification errors.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #linear-regression, #ab-testing, #casual-analysis, #bias, #multicollinearity, #confounding, #instrument-variables, #instrument-variables-math, and more.

    This story was written by: @varunnakra1. Learn more about this writer by checking @varunnakra1's about page, and for more stories, please visit hackernoon.com.

  • This story was originally published on HackerNoon at: https://hackernoon.com/using-arrow-flight-sql-protocol-in-apache-doris-21-for-super-fast-data-transfer.
    Apache Doris 2.1 just got a major speed boost with Arrow Flight SQL for up to 10x faster data transfers.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-science, #database, #data-engineering, #pandas, #python, #apache-arrow, #big-data, #data, and more.

    This story was written by: @frankzzz. Learn more about this writer by checking @frankzzz's about page, and for more stories, please visit hackernoon.com.

    Apache Doris 2.1 supports the Arrow Flight SQL protocol for reading data from Doris. It delivers speedups of tens of times compared to PyMySQL and Pandas.

  • This story was originally published on HackerNoon at: https://hackernoon.com/data-science-for-portfolio-optimization-markowitz-mean-variance-theory.
    The theory formulates a mathematical model to optimize the asset allocations to gain the maximum return for a given risk-level.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-science, #asset-management, #modern-portfolio-theory, #portfolio-optimization, #markowtiz-mean-variance, #what-is-the-markowitz-theory, #portfolio-theory, #investment-portfolio-tips, and more.

    This story was written by: @kustarev. Learn more about this writer by checking @kustarev's about page, and for more stories, please visit hackernoon.com.

    An investment portfolio comprises various assets such as stocks and bonds. Every investor starts with a fixed investment capital and decides how much to invest in each asset. Data science techniques such as the Markowitz mean-variance theory help determine the optimal share allocation to build the optimal portfolio. The theory formulates a mathematical model to optimize the asset allocations to gain the maximum return for a given risk-level. It analyzes different financial assets and considers their rate of return and risk factors, given their historical trends. The rate of return is an approximation of how much profit the asset will generate over a given time period. The risk factor is quantified using the standard deviation of the asset value. A higher deviation represents a volatile asset and, hence, higher risk. The return and risk values are calculated for various portfolio combinations and are represented on the efficient frontier curve. The curve helps investors determine the highest returns against their selected risk.
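
    The return and risk calculation described above can be sketched for a two-asset portfolio. This is a simplified illustration, not the full Markowitz optimization: the return series are made-up numbers, the helper names `portfolio_stats` and `best_mix` are hypothetical, and a coarse grid over weights stands in for a proper optimizer.

```python
import statistics

# Hypothetical periodic returns for two assets (illustrative numbers only).
stock_returns = [0.05, -0.02, 0.08, 0.01, -0.03, 0.06]
bond_returns = [0.01, 0.02, 0.01, 0.015, 0.02, 0.01]

def portfolio_stats(weight_stock):
    """Return (mean return, standard deviation) for a stock/bond mix."""
    weight_bond = 1.0 - weight_stock
    mixed = [weight_stock * s + weight_bond * b
             for s, b in zip(stock_returns, bond_returns)]
    return statistics.mean(mixed), statistics.stdev(mixed)

def best_mix(max_risk, steps=20):
    """On a grid of weights, pick the highest-return mix with risk <= max_risk."""
    candidates = []
    for i in range(steps + 1):
        weight = i / steps
        ret, risk = portfolio_stats(weight)
        if risk <= max_risk:
            candidates.append((ret, risk, weight))
    # Tuples compare by return first, so max() yields the highest return.
    return max(candidates) if candidates else None

for max_risk in (0.01, 0.02, 0.05):
    print(max_risk, best_mix(max_risk))
```

    Each result is one point near the efficient frontier: the best return found on the grid for the chosen risk limit. A realistic version would use many assets and their covariance matrix.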

  • This story was originally published on HackerNoon at: https://hackernoon.com/10-best-datasets-for-time-series-analysis.
    In order to understand how a certain metric varies over time and to predict future values, we will look at the 10 Best Datasets for Time Series Analysis.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #datasets, #ai, #data-science, #data-analysis, #data-analytics, #data-visualization, #hackernoon-datasets, #machine-learning, and more.

    This story was written by: @datasets. Learn more about this writer by checking @datasets's about page, and for more stories, please visit hackernoon.com.

    Time series data is essentially a collection of data points ordered in time. Time is frequently the independent variable, and the goal is usually to forecast future values. In this article, we will look at the 10 Best Datasets for Time Series Analysis in order to understand how a certain metric varies over time.

  • This story was originally published on HackerNoon at: https://hackernoon.com/understanding-scaling-law-through-data-science-lenses.
    Despite the immense promise of LMs, initial endeavors to apply pre-trained LMs to downstream tasks have encountered significant challenges.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-science, #scaling-law, #artificial-intelligence, #tokenizing-raw-text, #discrete-tokens, #embedding-vector, #token-embeddings, #power-laws, and more.

    This story was written by: @tianchengxu. Learn more about this writer by checking @tianchengxu's about page, and for more stories, please visit hackernoon.com.

    Despite the immense promise of LMs as task-neutral foundation models, initial endeavors to apply pre-trained LMs to downstream tasks encountered significant challenges.

  • This story was originally published on HackerNoon at: https://hackernoon.com/why-postgresql-is-the-bedrock-for-the-future-of-data.
    Explore the rise of PostgreSQL as the de facto database standard, its impact on software development, and the key trends driving its widespread adoption.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-management, #postgresql, #timescale, #future-of-data-science, #software-development, #database-complexity, #timescaledb, #good-company, #hackernoon-es, #hackernoon-hi, #hackernoon-zh, #hackernoon-fr, #hackernoon-bn, #hackernoon-ru, #hackernoon-vi, #hackernoon-pt, #hackernoon-ja, #hackernoon-de, #hackernoon-ko, #hackernoon-tr, and more.

    This story was written by: @timescale. Learn more about this writer by checking @timescale's about page, and for more stories, please visit hackernoon.com.

    PostgreSQL's ascendancy as the go-to database standard is rooted in its adaptability, reliability, and extensive ecosystem. This article delves into the reasons behind its dominance, from tackling database complexity to empowering developers to build the future with confidence. Discover how PostgreSQL is revolutionizing software development and data management practices.

  • This story was originally published on HackerNoon at: https://hackernoon.com/how-to-implement-multi-group-bar-chart-and-interact-with-highlighting-by-grouping-dimension.
    A solution for implementing a multi-group bar chart in which the two groups are differentiated in style through color transparency.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #visualization, #visactor, #vchart, #multi-group-bar-chart, #grouping-dimensions, #data-science-guide, #vchart-guide, #vchart-tutorial, and more.

    This story was written by: @hacker5022841. Learn more about this writer by checking @hacker5022841's about page, and for more stories, please visit hackernoon.com.

    A solution for implementing a multi-group bar chart in which the two groups are differentiated in style through color transparency. When the mouse hovers over a column block, all blocks of the same color are highlighted together.

  • This story was originally published on HackerNoon at: https://hackernoon.com/unlocking-the-invaluable-role-of-big-data-in-modern-supply-chain-management.
    Let’s take a deeper look at the scale of impact the big data revolution can have for global supply chains and vendor management.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #big-data, #supply-chain, #big-data-analytics, #management, #supply-chain-ai, #supply-chain-management, #data-revolution, #supply-chain-efficiency, and more.

    This story was written by: @dmytrospilka. Learn more about this writer by checking @dmytrospilka's about page, and for more stories, please visit hackernoon.com.

  • This story was originally published on HackerNoon at: https://hackernoon.com/advantages-and-disadvantages-of-big-data.
    Big data may seem like any other buzzword in business, but it’s important to understand how big data benefits a company and how it’s limited.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #big-data, #data-science, #data-analysis, #data-analytics, #big-data-analytics, #advantages-of-big-data, #disadvantages-of-big-data, and more.

    This story was written by: @devinpartida. Learn more about this writer by checking @devinpartida's about page, and for more stories, please visit hackernoon.com.

    Big data may seem like any other buzzword in business. Still, it’s important to understand how big data benefits a company and how it’s limited. If a company uses big data to its advantage, it can be a major boon and help the company outperform its competitors. Advantages include improved decision making, reduced costs, increased productivity and enhanced customer service. Disadvantages include cybersecurity risks, talent gaps and compliance complications.

  • This story was originally published on HackerNoon at: https://hackernoon.com/top-16-types-of-chart-in-data-visualization-hrh32wv.
    In the era of information explosion, more and more data piles up. However, this dense data is unfocused and hard to read, so we need data visualization to make it easier to understand and accept. Visualization is more intuitive and meaningful than raw data, and it is very important to use appropriate charts to visualize data.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-visualization, #big-data, #latest-tech-stories, #hackernoon-top-story, #data-representation, #how-to-visualize-data, #column-chart-vs-bar-chart, #area-chart-vs-pie-chart, #hackernoon-es, and more.

    This story was written by: @sage. Learn more about this writer by checking @sage's about page, and for more stories, please visit hackernoon.com.

    The Top 16 Types of Charts in Data Visualization That You'll Use include the Column Chart, Bar Chart, Scatter Plot, Bubble Chart, and Radar Chart. All the charts in the article are taken from the data visualization tool FineReport. The line chart is used to show the change of data over a continuous time interval or time span. It is characterized by a tendency to reflect things as they change over time or across ordered categories. Note that the number of data records in a line graph should be greater than 2.

  • This story was originally published on HackerNoon at: https://hackernoon.com/data-in-ai-a-deep-dive-with-jerome-pasquero.
    How is Data Transforming AI - The What's AI Podcast (episode 27)
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data, #big-data, #data-security, #ai, #artificial-intelligence, #machine-learning, #jerome-pasquero, #what's-ai, and more.

    This story was written by: @whatsai. Learn more about this writer by checking @whatsai's about page, and for more stories, please visit hackernoon.com.

    This week's episode of the What's AI podcast features Machine Learning Director Jerome Pasquero. We discussed the role of human judgment in data annotation. We also touched on the often subtle yet significant presence of AI in our daily routines. This episode is a must for anyone curious about the ways in which data fuels AI.

  • This story was originally published on HackerNoon at: https://hackernoon.com/14-best-tableau-datasets-for-practicing-data-visualization.
    This article focuses on the 14 Best Tableau Datasets for Practicing Data Visualization, which is essential for business analysts and data scientists.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #tableau, #data, #datasets, #covid-19-datasets, #data-visualization, #data-visualization-tools, #data-analysis, #tableau-vs-powerbi, and more.

    This story was written by: @datasets. Learn more about this writer by checking @datasets's about page, and for more stories, please visit hackernoon.com.

    Tableau is a data analysis and visualization tool that enables users to connect, visualize and share data in an easy-to-understand and meaningful way. This article focuses on the 14 Best Tableau Datasets for Practicing Data Visualization, essential for helping you gain valuable experience.

  • This story was originally published on HackerNoon at: https://hackernoon.com/the-lifecycle-of-a-data-warehouse.
    We're about to embark on the fascinating journey of building a data warehouse, guided by our adept Data Architect.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-warehouse, #business-intelligence, #databases, #cloud-storage, #etl, #olap, #database-management, #relational-database, and more.

    This story was written by: @ishaanraj. Learn more about this writer by checking @ishaanraj's about page, and for more stories, please visit hackernoon.com.

    A data warehouse, optimized for OLAP (Online Analytical Processing), is a centralized repository for structured and processed data. Unlike traditional OLTP (Online Transaction Processing) systems, it's designed for efficient querying and reporting. The use of columnar storage in data warehouses allows for quicker data retrieval, especially beneficial for analytical queries.