Commit 30765ce

Repo maintenance - Outdated links and minor things

1 parent 46fc95c

17 files changed (+22 −21 lines)

.gitignore  (+1)

@@ -0,0 +1 @@
+.idea

README.md  (+1 −1)

@@ -53,6 +53,6 @@ Learn more about Flink at https://flink.apache.org/.
 
 ## License
 
-Copyright © 2020 Ververica GmbH
+Copyright © 2020-2021 Ververica GmbH
 
 Distributed under Apache License, Version 2.0.

aggregations-and-analytics/01/01_group_by_window.md  (+1 −1)

@@ -2,7 +2,7 @@
 
 > :bulb: This example will show how to aggregate time-series data in real-time using a `TUMBLE` window.
 
-The source table (`server_logs`) is backed by the [`faker` connector](https://github.com/knaufk/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
+The source table (`server_logs`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
 
 Many streaming applications work with time-series data.
 To count the number of `DISTINCT` IP addresses seen each minute, rows need to be grouped based on a [time attribute](https://docs.ververica.com/user_guide/sql_development/table_view.html#time-attributes).
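For context, the query this recipe builds is a tumbling group window over that time attribute. A minimal sketch, assuming `client_ip` and `log_time` columns (names not taken from this diff):

```sql
-- Count distinct IP addresses per one-minute tumbling window.
-- Assumes log_time is the table's event-time attribute.
SELECT
  COUNT(DISTINCT client_ip) AS ip_addresses,
  TUMBLE_START(log_time, INTERVAL '1' MINUTE) AS window_start
FROM server_logs
GROUP BY TUMBLE(log_time, INTERVAL '1' MINUTE);
```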

aggregations-and-analytics/02/02_watermarks.md  (+2 −2)

@@ -4,7 +4,7 @@
 
 > :bulb: This example will show how to use `WATERMARK`s to work with timestamps in records.
 
-The source table (`doctor_sightings`) is backed by the [`faker` connector](https://github.com/knaufk/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
+The source table (`doctor_sightings`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
 
 The [previous recipe](../01/01_group_by_window.md) showed how a `TUMBLE` group window makes it simple to aggregate time-series data.
 
@@ -13,7 +13,7 @@ As different versions of the Doctor travel through time, various people log thei
 We want to track how many times each version of the Doctor is seen each minute.
 Unlike the previous recipe, these records have an embedded timestamp we need to use to perform our calculation.
 
-More often than not, most data will come with embedded timestamps that we want to use for our time series calculations. We call this timestamp an [event-time attribute](https://ci.apache.org/projects/flink/flink-docs-stable/learn-flink/streaming_analytics.html#event-time-and-watermarks).
+More often than not, most data will come with embedded timestamps that we want to use for our time series calculations. We call this timestamp an [event-time attribute](https://ci.apache.org/projects/flink/flink-docs-stable/docs/learn-flink/streaming_analytics/#event-time-and-watermarks).
 
 Event time represents when something actually happened in the real world.
 And it is unique because it is quasi-monotonically increasing; we generally see things that happened earlier before seeing things that happen later. Of course, data will never be perfectly ordered (systems go down, networks are laggy, doctor sightings take time to postmark and mail), and there will be some out-of-orderness in our data.
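The `WATERMARK` clause this recipe revolves around lives in the source table's DDL. A minimal sketch, assuming a `sighting_time` column and a 15-second out-of-orderness bound (both assumptions, not values from this diff):

```sql
CREATE TABLE doctor_sightings (
  doctor        STRING,
  sighting_time TIMESTAMP(3),
  -- declare sighting_time as the event-time attribute and tolerate
  -- records arriving up to 15 seconds out of order
  WATERMARK FOR sighting_time AS sighting_time - INTERVAL '15' SECOND
) WITH (
  'connector' = 'faker',
  'fields.doctor.expression' = '#{dr_who.the_doctors}',
  'fields.sighting_time.expression' = '#{date.past ''15'',''SECONDS''}'
);
```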

aggregations-and-analytics/03/03_group_by_session_window.md  (+1 −1)

@@ -2,7 +2,7 @@
 
 > :bulb: This example will show how to aggregate time-series data in real-time using a `SESSION` window.
 
-The source table (`server_logs`) is backed by the [`faker` connector](https://github.com/knaufk/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
+The source table (`server_logs`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
 
 #### What are Session Windows?
 
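For context: where a `TUMBLE` window has a fixed size, a `SESSION` window closes after a configurable gap of inactivity. A sketch with an assumed 10-second gap and assumed column names:

```sql
-- One session per client IP; the session ends once that IP has been
-- quiet for 10 seconds.
SELECT
  client_ip,
  SESSION_START(log_time, INTERVAL '10' SECOND) AS session_beginning,
  COUNT(*) AS request_count
FROM server_logs
GROUP BY client_ip, SESSION(log_time, INTERVAL '10' SECOND);
```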

aggregations-and-analytics/04/04_over.md  (+1 −1)

@@ -2,7 +2,7 @@
 
 > :bulb: This example will show how to calculate an aggregate or cumulative value based on a group of rows using an `OVER` window. A typical use case is rolling aggregations.
 
-The source table (`temperature_measurements`) is backed by the [`faker` connector](https://github.com/knaufk/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
+The source table (`temperature_measurements`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
 
 OVER window aggregates compute an aggregated value for every input row over a range of ordered rows.
 In contrast to GROUP BY aggregates, OVER aggregates do not reduce the number of result rows to a single row for every group.
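For context, an `OVER` aggregation of the kind described here looks roughly like this (a sketch; the one-minute range and the column names are assumptions):

```sql
-- Every measurement row is emitted together with the rolling average
-- of its city's temperatures over the preceding minute.
SELECT
  measurement_time,
  city,
  temperature,
  AVG(temperature) OVER (
    PARTITION BY city
    ORDER BY measurement_time
    RANGE BETWEEN INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW
  ) AS avg_temperature_last_minute
FROM temperature_measurements;
```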

aggregations-and-analytics/05/05_top_n.md  (+1 −1)

@@ -4,7 +4,7 @@
 
 > :bulb: This example will show how to continuously calculate the "Top-N" rows based on a given attribute, using an `OVER` window and the `ROW_NUMBER()` function.
 
-The source table (`spells_cast`) is backed by the [`faker` connector](https://github.com/knaufk/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
+The source table (`spells_cast`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
 
 The Ministry of Magic tracks every spell a wizard casts throughout Great Britain and wants to know every wizard's Top 2 all-time favorite spells.
 
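The standard Top-N pattern combines an inner aggregation with `ROW_NUMBER()` and an outer filter. A sketch under assumed column names:

```sql
-- Keep only the two most-cast spells per wizard; the result updates
-- continuously as new spells arrive.
SELECT wizard, spell, times_cast
FROM (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY wizard ORDER BY times_cast DESC) AS row_num
  FROM (
    SELECT wizard, spell, COUNT(*) AS times_cast
    FROM spells_cast
    GROUP BY wizard, spell
  )
)
WHERE row_num <= 2;
```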

aggregations-and-analytics/07/07_chained_windows.md  (+2 −2)

@@ -2,7 +2,7 @@
 
 > :bulb: This example will show how to efficiently aggregate time-series data on two different levels of granularity.
 
-The source table (`server_logs`) is backed by the [`faker` connector](https://github.com/knaufk/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
+The source table (`server_logs`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
 
 Based on our `server_logs` table we would like to compute the average request size over one minute **as well as five minute (event) windows.**
 For this, you could run two queries, similar to the one in [Aggregating Time Series Data](../01/01_group_by_window.md).
@@ -11,7 +11,7 @@ At the end of the page is the script and resulting JobGraph from this approach.
 In the main part, we will follow a slightly more efficient approach that chains the two aggregations: the one-minute aggregation output serves as the five-minute aggregation input.
 
 We then use a [Statement Set](../../foundations/08/08_statement_sets.md) to write out the two result tables.
-To keep this example self-contained, we use two tables of type `blackhole` instead of `kafka`, `filesystem`, or any other [connectors](https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/).
+To keep this example self-contained, we use two tables of type `blackhole` instead of `kafka`, `filesystem`, or any other [connectors](https://ci.apache.org/projects/flink/flink-docs-stable/docs/connectors/table/overview/).
 
 ## Script
 
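The chaining works because `TUMBLE_ROWTIME` preserves an event-time attribute that the next window can consume. A sketch of the overall shape, assuming `log_time`/`size` columns and pre-declared `blackhole` sink tables (all names are assumptions):

```sql
-- 1-minute pre-aggregation; TUMBLE_ROWTIME keeps window_time usable
-- as a rowtime attribute downstream.
CREATE TEMPORARY VIEW server_logs_1m AS
SELECT
  TUMBLE_ROWTIME(log_time, INTERVAL '1' MINUTE) AS window_time,
  AVG(size) AS avg_size
FROM server_logs
GROUP BY TUMBLE(log_time, INTERVAL '1' MINUTE);

-- A single job writes both granularities via a statement set.
BEGIN STATEMENT SET;

INSERT INTO avg_request_size_1m SELECT * FROM server_logs_1m;

INSERT INTO avg_request_size_5m
SELECT
  TUMBLE_START(window_time, INTERVAL '5' MINUTE) AS window_start,
  AVG(avg_size) AS avg_size
FROM server_logs_1m
GROUP BY TUMBLE(window_time, INTERVAL '5' MINUTE);

END;
```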

aggregations-and-analytics/08/08_match_recognize.md  (+1 −1)

@@ -41,7 +41,7 @@ BASIC AS BASIC.type = 'basic');
 
 ## Script
 
-The source table (`subscriptions`) is backed by the [`faker` connector](https://github.com/knaufk/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
+The source table (`subscriptions`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
 
 ```sql
 CREATE TABLE subscriptions (
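The hunk header quotes the tail of the recipe's `DEFINE` clause; the surrounding `MATCH_RECOGNIZE` has roughly this shape (a sketch; the partition key, time attribute, and measures are assumptions):

```sql
SELECT *
FROM subscriptions
MATCH_RECOGNIZE (
  PARTITION BY user_id
  ORDER BY proc_time
  MEASURES
    PREMIUM.type AS old_subscription,
    BASIC.type   AS new_subscription
  AFTER MATCH SKIP PAST LAST ROW
  -- match a premium subscription immediately followed by a basic one
  PATTERN (PREMIUM BASIC)
  DEFINE
    PREMIUM AS PREMIUM.type = 'premium',
    BASIC AS BASIC.type = 'basic'
);
```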

foundations/04/04_where.md  (+1 −1)

@@ -2,7 +2,7 @@
 
 > :bulb: This example will show how to filter server logs in real-time using a standard `WHERE` clause.
 
-The table it uses, `server_logs`, is backed by the [`faker` connector](https://github.com/knaufk/flink-faker) which continuously generates rows in memory based on Java Faker expressions and is convenient for testing queries.
+The table it uses, `server_logs`, is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker) which continuously generates rows in memory based on Java Faker expressions and is convenient for testing queries.
 As such, it is an alternative to the built-in `datagen` connector used for example in [the first recipe](../01/01_create_table.md).
 
 You can continuously filter these logs for those requests that experience authx issues with a simple `SELECT` statement with a `WHERE` clause filtering on the auth related HTTP status codes.
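A sketch of such a filter, assuming `status_code` is stored as a string and that 401/403 are the auth-related codes of interest (column names are assumptions):

```sql
-- Continuously emit only requests that failed authentication or
-- authorization.
SELECT client_ip, request_line, status_code
FROM server_logs
WHERE status_code IN ('401', '403');
```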

foundations/05/05_group_by.md  (+2 −2)

@@ -2,9 +2,9 @@
 
 > :bulb: This example will show how to aggregate server logs in real-time using the standard `GROUP BY` clause.
 
-The source table (`server_logs`) is backed by the [`faker` connector](https://github.com/knaufk/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
+The source table (`server_logs`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
 
-To count the number of logs received per browser for each status code _over time_, you can combine the `COUNT` aggregate function with a `GROUP BY` clause. Because the `user_agent` field contains a lot of information, you can extract the browser using the built-in [string function](https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/functions/systemFunctions.html#string-functions) `REGEXP_EXTRACT`.
+To count the number of logs received per browser for each status code _over time_, you can combine the `COUNT` aggregate function with a `GROUP BY` clause. Because the `user_agent` field contains a lot of information, you can extract the browser using the built-in [string function](https://ci.apache.org/projects/flink/flink-docs-stable/docs/dev/table/functions/systemfunctions/#string-functions) `REGEXP_EXTRACT`.
 
 A `GROUP BY` on a streaming table produces an updating result, so you will see the aggregated count for each browser continuously changing as new rows flow in.
 
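A sketch of the described query (the regex and column names are assumptions; `'[^\/]+'` keeps everything before the first slash of the user agent):

```sql
-- Updating count of log lines per (browser, status code) pair.
SELECT
  REGEXP_EXTRACT(user_agent, '[^\/]+') AS browser,
  status_code,
  COUNT(*) AS cnt_status
FROM server_logs
GROUP BY REGEXP_EXTRACT(user_agent, '[^\/]+'), status_code;
```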

joins/03/03_kafka_join.md  (+1 −1)

@@ -67,7 +67,7 @@ ON t.currency_code = c.currency_code;
 <summary>Data Generators</summary>
 
 The two topics are populated using a Flink SQL job, too.
-We use the [`faker` connector](https://github.com/knaufk/flink-faker) to generate rows in memory based on Java Faker expressions and write those to the respective Kafka topics.
+We use the [`faker` connector](https://flink-packages.org/packages/flink-faker) to generate rows in memory based on Java Faker expressions and write those to the respective Kafka topics.
 
 ### ``currency_rates`` Topic
 
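A sketch of such a generator job (field list and Faker expressions are assumptions): a faker-backed table feeds the Kafka-backed `currency_rates` table through a continuous `INSERT`:

```sql
CREATE TEMPORARY TABLE currency_rates_faker (
  currency_code STRING,
  eur_rate      DECIMAL(6, 4)
) WITH (
  'connector' = 'faker',
  'fields.currency_code.expression' = '#{Currency.code}',
  'fields.eur_rate.expression' = '#{number.randomDouble ''4'',''0'',''10''}'
);

-- Continuously copy generated rows into the Kafka topic.
INSERT INTO currency_rates SELECT * FROM currency_rates_faker;
```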

joins/04/04_lookup_joins.md  (+1 −1)

@@ -14,7 +14,7 @@ In this example, you will look up reference user data stored in MySQL to flag su
 
 ## Script
 
-The source table (`subscriptions`) is backed by the [`faker` connector](https://github.com/knaufk/flink-faker), which continuously generates rows in memory based on Java Faker expressions. The `users` table is backed by an existing MySQL reference table using the [JDBC connector](https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/jdbc.html).
+The source table (`subscriptions`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions. The `users` table is backed by an existing MySQL reference table using the [JDBC connector](https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/jdbc.html).
 
 ```sql
 CREATE TABLE subscriptions (
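The lookup itself uses Flink's processing-time temporal join syntax. A sketch, assuming `subscriptions` declares a `proc_time AS PROCTIME()` column and that both tables share a `user_id` key (all names are assumptions):

```sql
-- Each incoming subscription triggers a key lookup against MySQL
-- as of the row's processing time.
SELECT s.id AS subscription_id, u.user_name
FROM subscriptions AS s
JOIN users FOR SYSTEM_TIME AS OF s.proc_time AS u
  ON s.user_id = u.user_id;
```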

joins/05/05_star_schema.md  (+1 −1)

@@ -112,7 +112,7 @@ ON t.destination_station_key = ds.station_key
 <summary>Data Generators</summary>
 
 The four topics are populated with Flink SQL jobs, too.
-We use the [`faker` connector](https://github.com/knaufk/flink-faker) to generate rows in memory based on Java Faker expressions and write those to the respective Kafka topics.
+We use the [`faker` connector](https://flink-packages.org/packages/flink-faker) to generate rows in memory based on Java Faker expressions and write those to the respective Kafka topics.
 
 ### ``train_activities`` Topic
 
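The hunk header quotes one of the recipe's join conditions; denormalizing the fact stream against its dimensions has roughly this shape (a plain-join sketch; the recipe may well use temporal joins, and every name beyond the quoted condition is an assumption):

```sql
-- Resolve origin and destination keys against the same dimension table.
SELECT
  t.activity_time,
  os.station_name AS origin_station,
  ds.station_name AS destination_station
FROM train_activities t
JOIN stations os ON t.origin_station_key = os.station_key
JOIN stations ds ON t.destination_station_key = ds.station_key;
```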

other-builtin-functions/01/01_date_time.md  (+2 −2)

@@ -2,7 +2,7 @@
 
 > :bulb: This example will show how to use [built-in date and time functions](https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/functions/systemFunctions.html#temporal-functions) to manipulate temporal fields.
 
-The source table (`subscriptions`) is backed by the [`faker` connector](https://github.com/knaufk/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
+The source table (`subscriptions`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
 
 #### Date and Time Functions
 
@@ -22,7 +22,7 @@ Assume you have a table with service subscriptions and that you want to continuo
 
 * `CURRENT_TIMESTAMP`: returns the current SQL timestamp (UTC)
 
-For a complete list of built-in date and time functions, check the Flink [documentation](https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/functions/systemFunctions.html#temporal-functions).
+For a complete list of built-in date and time functions, check the Flink [documentation](https://ci.apache.org/projects/flink/flink-docs-stable/docs/dev/table/functions/systemfunctions/#temporal-functions).
 
 > As an exercise, you can try to reproduce the same filtering condition using `TIMESTAMPADD` instead.
 
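For instance, the temporal arithmetic described above can filter subscriptions ending within the next 30 days (a sketch; the column names and the interval are assumptions):

```sql
SELECT id, start_date, end_date
FROM subscriptions
WHERE end_date BETWEEN CURRENT_TIMESTAMP
  AND CURRENT_TIMESTAMP + INTERVAL '30' DAY;
```

The exercise's `TIMESTAMPADD` variant would replace the upper bound with `TIMESTAMPADD(DAY, 30, CURRENT_TIMESTAMP)`.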

other-builtin-functions/02/02_union-all.md  (+2 −2)

@@ -2,7 +2,7 @@
 
 > :bulb: This example will show how you can use the set operation `UNION ALL` to combine several streams of data.
 
-See [our documentation](https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/queries/#set-operations)
+See [our documentation](https://ci.apache.org/projects/flink/flink-docs-stable/docs/dev/table/sql/queries/set-ops/)
 for a full list of fantastic set operations Apache Flink supports.
 
 
@@ -12,7 +12,7 @@ The examples assumes you are building an application that is tracking visits :fo
 There are three sources of visits. The universe of Rick and Morty, the very real world of NASA and such,
 and the not so real world of Hitchhikers Guide To The Galaxy.
 
-All three tables are `unbounded` and backed by the [`faker` connector](https://github.com/knaufk/flink-faker).
+All three tables are `unbounded` and backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker).
 
 All sources of tracked visits have the `location` and `visit_time` in common. Some have `visitors`, some have
 `spacecrafts` and one has both.
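Since `location` and `visit_time` are the shared columns, a minimal sketch of the combination (the table names are assumptions):

```sql
-- UNION ALL keeps duplicates and requires union-compatible schemas,
-- so we project down to the columns all three streams share.
SELECT location, visit_time FROM rickandmorty_visits
UNION ALL
SELECT location, visit_time FROM spaceagency_visits
UNION ALL
SELECT location, visit_time FROM hitchgalaxyguide_visits;
```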

udfs/01/01_python_udfs.md  (+1 −1)

@@ -37,7 +37,7 @@ For detailed instructions on how to then make the Python file available as a UDF
 
 #### SQL
 
-The source table (`temperature_measurements`) is backed by the [`faker` connector](https://github.com/knaufk/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
+The source table (`temperature_measurements`) is backed by the [`faker` connector](https://flink-packages.org/packages/flink-faker), which continuously generates rows in memory based on Java Faker expressions.
 
 ```sql
 --Register the Python UDF using the fully qualified
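The registration step the truncated context refers to has roughly this shape (a sketch; the module, function, and column names are assumptions):

```sql
-- Register the Python function under a SQL name using its fully
-- qualified module path, then call it like any built-in function.
CREATE FUNCTION to_fahrenheit AS 'python_udf.to_fahrenheit'
LANGUAGE PYTHON;

SELECT city, to_fahrenheit(temperature) AS temperature_f
FROM temperature_measurements;
```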
