[Merged by Bors] - Docs: New index page #460

Closed
wants to merge 9 commits into from
4 changes: 4 additions & 0 deletions docs/modules/nifi/images/.$nifi_overview.drawio.svg.bkp

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions docs/modules/nifi/images/nifi_overview.drawio.svg
31 changes: 0 additions & 31 deletions docs/modules/nifi/pages/commandline_args.adoc

This file was deleted.

91 changes: 88 additions & 3 deletions docs/modules/nifi/pages/configuration.adoc
Original file line number Diff line number Diff line change
@@ -1,7 +1,92 @@
= Configuration

include::commandline_args.adoc[]
== Command Line Parameters

include::partial$env_var_args.adoc[]
This operator accepts the following command line parameters:

include::partial$config_properties.adoc[]
=== product-config

*Default value*: `/etc/stackable/nifi-operator/config-spec/properties.yaml`

*Required*: false

*Multiple values:* false

[source]
----
stackable-nifi-operator run --product-config /foo/bar/properties.yaml
----

=== watch-namespace

*Default value*: All namespaces

*Required*: false

*Multiple values:* false

The operator will **only** watch for resources in the provided namespace `test`:

[source]
----
stackable-nifi-operator run --watch-namespace test
----

== Environment variables

This operator accepts the following environment variables:

=== PRODUCT_CONFIG

*Default value*: `/etc/stackable/nifi-operator/config-spec/properties.yaml`

*Required*: false

*Multiple values:* false

[source]
----
export PRODUCT_CONFIG=/foo/bar/properties.yaml
stackable-nifi-operator run
----

or via docker:

----
docker run \
--name nifi-operator \
--network host \
--env KUBECONFIG=/home/stackable/.kube/config \
--env PRODUCT_CONFIG=/my/product/config.yaml \
--mount type=bind,source="$HOME/.kube/config",target="/home/stackable/.kube/config" \
docker.stackable.tech/stackable/nifi-operator:latest
----

=== WATCH_NAMESPACE

*Default value*: All namespaces

*Required*: false

*Multiple values:* false

The operator will **only** watch for resources in the provided namespace `test`:

[source]
----
export WATCH_NAMESPACE=test
stackable-nifi-operator run
----

or via docker:

[source]
----
docker run \
--name nifi-operator \
--network host \
--env KUBECONFIG=/home/stackable/.kube/config \
--env WATCH_NAMESPACE=test \
--mount type=bind,source="$HOME/.kube/config",target="/home/stackable/.kube/config" \
docker.stackable.tech/stackable/nifi-operator:latest
----
16 changes: 0 additions & 16 deletions docs/modules/nifi/pages/dependencies.adoc

This file was deleted.

27 changes: 25 additions & 2 deletions docs/modules/nifi/pages/index.adoc
Original file line number Diff line number Diff line change
@@ -1,8 +1,31 @@
= Stackable Operator for Apache NiFi
:description: The Stackable Operator for Apache NiFi is a Kubernetes operator that can manage Apache NiFi clusters. Learn about its features, resources, dependencies and demos, and see the list of supported NiFi versions.
:keywords: k8s, Kubernetes, Stackable Operator, Apache NiFi, open source, operator, data science, data exploration, big data

This is an operator for Kubernetes that can manage https://nifi.apache.org/[Apache NiFi] clusters.
This Operator manages https://nifi.apache.org/[Apache NiFi] clusters on Kubernetes.
Apache NiFi is an open-source data integration tool that provides a web-based interface for designing, monitoring and managing data flows between various systems and devices, using a visual programming approach. It supports a wide range of data sources, formats and features such as data provenance, security and clustering.

WARNING: This operator only works with images from the https://repo.stackable.tech/#browse/browse:docker:v2%2Fstackable%2Fnifi[Stackable] repository
== Getting started

Get started with Apache NiFi and the Stackable Operator by following the xref:getting_started/index.adoc[] guide. It walks you through the xref:getting_started/installation.adoc[installation] process and shows you how to xref:getting_started/first_steps.adoc[connect] to the NiFi web interface. Afterwards, have a look at the xref:usage_guide/index.adoc[] to learn how to configure your NiFi instance to your needs, or run some <<demos, demos>> to learn more about using NiFi with other components.

== Operator Model

The Operator manages the _NifiCluster_ custom resource. NiFi only has a single process that it needs to run, so the NifiCluster has only a single xref:concepts:roles-and-role-groups.adoc[role]: `node`. This role can be divided into multiple role groups.

image::nifi_overview.drawio.svg[A diagram depicting the Kubernetes resources created by the Stackable Operator for Apache NiFi]

For every role group the Operator creates a ConfigMap and a StatefulSet, which can have multiple replicas (Pods). Every role group is accessible through its own Service, and there is a Service for the whole cluster.
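
As a rough illustration of role groups, the fragment below sketches a `nodes` role split into two groups. The group names `small` and `large` are hypothetical, and the other fields a NifiCluster needs (such as `image` and `clusterConfig`) are left out, so this is not a complete manifest.

[source,yaml]
----
# Sketch only: "small" and "large" are hypothetical role group names.
spec:
  nodes:
    roleGroups:
      small:
        replicas: 1
      large:
        replicas: 3
----

Under the model described above, each of the two groups would get its own ConfigMap, StatefulSet and Service.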

== Dependencies

Apache NiFi depends on Apache ZooKeeper which you can run in Kubernetes with the xref:zookeeper:index.adoc[].

== [[demos]]Demos

NiFi is often a good choice as a first step in a data pipeline when it comes to fetching the data in various formats from various sources. The xref:stackablectl::demos/data-lakehouse-iceberg-trino-spark.adoc[] demo uses NiFi to fetch six different datasets in various formats. The data is then ingested into a Kafka topic. Apache Kafka is also xref:kafka:index.adoc[part of the Stackable platform].

The xref:stackablectl::demos/nifi-kafka-druid-earthquake-data.adoc[] and xref:stackablectl::demos/nifi-kafka-druid-water-level-data.adoc[] demos use NiFi in the same way: both showcase downloading data from web APIs and ingesting it into Kafka.

== Supported Versions

51 changes: 50 additions & 1 deletion docs/modules/nifi/pages/usage_guide/index.adoc
Original file line number Diff line number Diff line change
@@ -1,3 +1,52 @@
= Usage guide

This section will help you to use various aspects of the Stackable Operator for Apache NiFi. For a general introduction into the operator follow the xref:getting_started/index.adoc[] guide.
This section will help you to use various aspects of the Stackable Operator for Apache NiFi. For a general introduction to the operator, follow the xref:getting_started/index.adoc[] guide. Below is a general overview of some configuration aspects; have a look at the sub pages for details.

The cluster is configured via a YAML manifest file. This custom resource specifies the number of replicas for each role group as well as role-specific configuration such as resource requests.
The following listing shows an example configuration:

[source,yaml]
----
apiVersion: nifi.stackable.tech/v1alpha1
kind: NifiCluster
metadata:
  name: simple-nifi
spec:
  image:
    productVersion: 1.18.0
    stackableVersion: "23.4.0"
  clusterConfig:
    zookeeperConfigMapName: simple-nifi-znode # <1>
    authentication: # <2>
      method:
        SingleUser:
          adminCredentialsSecret:
            name: nifi-admin-credentials-simple
            namespace: default
      allowAnonymousAccess: true
    extraVolumes: # <3>
      - name: nifi-client-certs
        secret:
          secretName: nifi-client-certs
    sensitiveProperties:
      keySecret: nifi-sensitive-property-key
      autoGenerate: true
  nodes:
    roleGroups:
      default:
        config:
          resources: # <4>
            cpu:
              min: "500m"
              max: "4"
            memory:
              limit: '2Gi'
        replicas: 3
----

<1> The xref:usage_guide/zookeeper-connection.adoc[ZooKeeper instance] to use.
<2> How users should xref:usage_guide/security.adoc[authenticate] themselves.
<3> xref:usage_guide/extra-volumes.adoc[Extra volumes] with files that can be referenced in custom workflows.
<4> xref:usage_guide/resource-configuration.adoc[CPU and memory configuration] can be set per role group.

Not shown are the common settings for xref:usage_guide/cluster-operations.adoc[starting and stopping the cluster] and xref:usage_guide/pod-placement.adoc[distributing Pods]. Additionally you can set any NiFi setting using xref:usage_guide/configuration-environment-overrides.adoc[overrides]. You can also configure xref:usage_guide/log-aggregation.adoc[log aggregation].
11 changes: 9 additions & 2 deletions docs/modules/nifi/pages/usage_guide/security.adoc
Original file line number Diff line number Diff line change
@@ -4,11 +4,13 @@

Every user has to authenticate themselves before using NiFi.
There are multiple options to set up the authentication of users.
All authentication related parameters are configured under `spec.clusterConfig.authentication`.

=== Single user

The default setting is to only provision a single user with administrative privileges.
You need to specify the username and password of the user.
Currently, the only supported authentication method is "SingleUser", which defines a single admin user who can then access the cluster.
This user's credentials are specified by referring to a Kubernetes Secret, which must contain at least the two keys `username` and `password`.
Extra keys may be present, but will be ignored by the operator.

[source,yaml]
----
@@ -36,6 +38,11 @@ spec:

Additional users cannot be added.
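
For illustration, a Secret holding the admin credentials might look like the sketch below. The name `nifi-admin-credentials-simple` and the namespace `default` are taken from the usage guide example; the `username` and `password` values are placeholders and must be replaced.

[source,yaml]
----
# Sketch only: the username and password values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: nifi-admin-credentials-simple
  namespace: default
stringData:
  username: admin
  password: change-me
----

`stringData` is used here so the values can be written in plain text; Kubernetes stores them base64-encoded under `data`.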

==== Anonymous Access

NiFi can be configured to allow anonymous access to the web UI. This is turned off by default, but can be enabled via the parameter `allowAnonymousAccess`.
This setting is independent of the configured authentication method and will override anything specified for the authentication provider.
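
A minimal fragment enabling anonymous access could look as follows. It assumes that `allowAnonymousAccess` sits directly under `spec.clusterConfig.authentication`, next to the `method` key rather than inside it; the rest of the manifest is omitted.

[source,yaml]
----
# Sketch only: assumes allowAnonymousAccess is a sibling of "method".
spec:
  clusterConfig:
    authentication:
      allowAnonymousAccess: true
----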

[#authentication-ldap]
=== LDAP

12 changes: 12 additions & 0 deletions docs/modules/nifi/pages/usage_guide/zookeeper-connection.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
= Connecting NiFi to ZooKeeper

NiFi in cluster mode requires a ZooKeeper ensemble for state management and leader election. At the moment, this operator does not support single-node deployments without ZooKeeper, hence this is a required setting.

[source,yaml]
----
spec:
  clusterConfig:
    zookeeperConfigMapName: simple-nifi-znode
----

Configuration happens via a ConfigMap, which needs to contain two keys: `ZOOKEEPER_HOSTS`, whose value is the ZooKeeper connection string, and `ZOOKEEPER_CHROOT`, whose value is the ZooKeeper chroot. This ConfigMap is typically created by a ZookeeperZnode of the xref:zookeeper:index.adoc[ZooKeeper Operator].
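
To make the expected shape concrete, the sketch below shows what such a ConfigMap might contain. The two keys come from the description above and the name matches the `zookeeperConfigMapName` example; the host names and the chroot path are hypothetical values.

[source,yaml]
----
# Sketch only: host names and chroot path are hypothetical.
apiVersion: v1
kind: ConfigMap
metadata:
  name: simple-nifi-znode
data:
  ZOOKEEPER_HOSTS: "zk-0.example.internal:2181,zk-1.example.internal:2181"
  ZOOKEEPER_CHROOT: "/nifi"
----

In practice this ConfigMap is usually not written by hand, since it is generated when a ZookeeperZnode is created.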
106 changes: 0 additions & 106 deletions docs/modules/nifi/partials/config_properties.adoc

This file was deleted.
