upgrading cluster from 1.0.0 to 1.3.0


Jeremy Farbota
We're preparing to upgrade our cluster from 1.0.0 to 1.3.0.

We're using external zookeeper.

We're wondering if we can simply expand the cluster with 1.3.0 machines and then turn off the older ones to keep processes running. Are there issues with attempting that? Is there an upgrade guide for clusters somewhere on the wiki?

We have Kafka consumers and other maintenance processes running in production, so we'd like to make the change without disturbing the state of those consumers if possible.

Kindly,


Jeremy Farbota
Software Engineer, Data
Payoff, Inc.

[hidden email] | (217) 898-8110

Re: upgrading cluster from 1.0.0 to 1.3.0

Neil Derraugh



Re: upgrading cluster from 1.0.0 to 1.3.0

Jeremy Farbota
Thanks; right after I fired off this email I saw that. We'll try a couple of different things. I'll assume this will not work and will switch the consumers using a route with partition offsets.

If there is any other information, please let me know.

Jeremy Farbota
Software Engineer, Data
Payoff, Inc.

[hidden email] | (217) 898-8110




Re: upgrading cluster from 1.0.0 to 1.3.0

Jeremy Farbota
We are attempting to upgrade and we're having issues:

We created new nodes with the new version (1.3.0). We stopped all flows and ensured nothing was left in any queue. We copied the flow.xml.gz file to the new node.

Are we following the right protocol? Should we be copying the users.xml file as well?

Is there a way to migrate to the new version while keeping the state of my consumers?

It seems like I have to build a cluster and recreate the user permissions and everything from scratch. Is there a workaround?

Is there a wiki or any notes about how to upgrade a cluster?

The last time I posted this, the reply linked to a thread that does not provide any additional info about how to bring the cluster up on the new version.

Jeremy Farbota
Software Engineer, Data
Payoff, Inc.

[hidden email] | (217) 898-8110



Re: upgrading cluster from 1.0.0 to 1.3.0

Andy LoPresto
Jeremy,

Sorry to hear you are having difficulties upgrading. I’ll try to answer your questions as best I can, but I welcome others joining the thread.

It sounds like this is less of an in-place upgrade and more of a cluster migration, since you are installing the software on new nodes and moving the flow there. If you do not also move the various repositories, you effectively have a fresh install and have just copied the flow definition.

The Apache NiFi wiki does provide Migration Guidance [1] and this document is updated with best practices/standard operating procedure with each release. It is prepared by the developers who have tested the upgrade process, and hopefully is helpful to end users doing the same. Perhaps even more relevant is the Upgrade Guide [2], which has step-by-step instructions for configuring an instance to be easily upgradable, and how to perform the upgrades in-place quickly and repeatably. 

To “upgrade” to a new node and maintain the existing functionality and configurations, you would want to copy additional files (a rough copy script is sketched after the lists):

* conf/authorizers.xml — contains the user/group providers and access policy providers (separated during 1.3.0) which control permissions in the application
* conf/bootstrap.conf — contains the master key for encryption and other NiFi runtime arguments
* conf/logback.xml — defines log levels
* conf/login-identity-providers.xml — defines LDAP or Kerberos connection details
* conf/nifi.properties — global property definition for the application
* conf/state-management.xml — defines state management configuration (especially important for a cluster)
* conf/zookeeper.properties — defines ZK configurations and addresses (especially important for a cluster)

[Optional]

* flowfile_repository/ — copy this to allow flowfile replay and history
* content_repository/ — copy this to allow flowfile replay and history
* provenance_repository/ — copy this to maintain the history of the provenance records
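
For illustration, a rough Python sketch of that copy step follows. This is not official NiFi tooling: the OLD_HOME/NEW_HOME paths are placeholders for your own install locations, it assumes the default directory layout (flow.xml.gz lives under conf/), and NiFi should be stopped on both nodes before anything is copied.

#!/usr/bin/env python3
# Sketch only: copy the NiFi config files (and optionally the repositories)
# listed above from an old install to a new one. OLD_HOME and NEW_HOME are
# placeholder paths -- adjust for your layout, and stop NiFi before copying.
import shutil
from pathlib import Path

OLD_HOME = Path("/opt/nifi-1.0.0")  # placeholder
NEW_HOME = Path("/opt/nifi-1.3.0")  # placeholder

CONF_FILES = [
    "conf/flow.xml.gz",
    "conf/authorizers.xml",
    "conf/bootstrap.conf",
    "conf/logback.xml",
    "conf/login-identity-providers.xml",
    "conf/nifi.properties",
    "conf/state-management.xml",
    "conf/zookeeper.properties",
]

# Optional: carrying these over keeps flowfile replay, content, and
# provenance history on the new node.
OPTIONAL_REPOS = [
    "flowfile_repository",
    "content_repository",
    "provenance_repository",
]

def migrate(copy_repos=False):
    for rel in CONF_FILES:
        src, dst = OLD_HOME / rel, NEW_HOME / rel
        if src.exists():
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 preserves timestamps/permissions
    if copy_repos:
        for rel in OPTIONAL_REPOS:
            src = OLD_HOME / rel
            if src.exists():
                # dirs_exist_ok requires Python 3.8+
                shutil.copytree(src, NEW_HOME / rel, dirs_exist_ok=True)

if __name__ == "__main__":
    migrate(copy_repos=False)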

I’m not quite sure what you mean by “keeping the state of my consumers” — is this specifically a ConsumeKafka processor or just source processors in general? If you upgrade in-place or copy the various configs and repositories mentioned above, when the app starts again, the component state should be as you left it (barring code changes to the component in the new release). 

Hopefully those instructions and the documents linked below will help you. We definitely want to make upgrading a seamless experience because we are constantly working to bring new features and optimizations into the app, and supporting legacy versions is an added cost. We are not looking for version fragmentation here. Any suggestions you have for improving the upgrade process and our communication around those instructions are welcome. Thanks.

[1] https://cwiki.apache.org/confluence/display/NIFI/Migration+Guidance
[2] https://cwiki.apache.org/confluence/display/NIFI/Upgrading+NiFi

Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


Re: upgrading cluster from 1.0.0 to 1.3.0

Andy LoPresto
I believe I missed a crucial file — if you are using the default file-based authorizer provided by NiFi, you will also want to copy conf/authorized-users.xml, which defines the various users and their access control policies. Sorry about that; the instance I referenced when typing wasn’t using it, so I forgot.

Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


Re: upgrading cluster from 1.0.0 to 1.3.0

Jeremy Farbota
Andy,

Thanks a ton. Huge help. We will report back.

Jeremy Farbota
Software Engineer, Data
Payoff, Inc.

[hidden email] | (217) 898-8110



Re: upgrading cluster from 1.0.0 to 1.3.0

Matt Gilman
Thanks for the great summary, Andy. Just wanted to clarify one small detail: the default file-based authorizer uses conf/authorizations.xml, though this value is configurable in conf/authorizers.xml. In 0.x, when NiFi used role-based authorization, the file was called conf/authorized-users.xml.
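
If it helps, here is a small Python sketch (not official tooling) that reads conf/authorizers.xml and prints where each configured authorizer keeps its users and authorizations. It assumes the stock 1.x layout, where the default file authorizer carries "Users File" and "Authorizations File" properties:

#!/usr/bin/env python3
# Sketch only: show which users/authorizations files each authorizer in
# conf/authorizers.xml is configured to use. Assumes the stock 1.x layout.
import xml.etree.ElementTree as ET

root = ET.parse("conf/authorizers.xml").getroot()
for authorizer in root.iter("authorizer"):
    ident = authorizer.findtext("identifier")
    for prop in authorizer.iter("property"):
        if prop.get("name") in ("Users File", "Authorizations File"):
            # e.g. file-provider: Authorizations File -> ./conf/authorizations.xml
            print(f"{ident}: {prop.get('name')} -> {prop.text}")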

Matt



Re: upgrading cluster from 1.0.0 to 1.3.0

Jeremy Farbota
I wanted to report back that our upgrade is complete on dev and prod!

We pretty much followed the advice here and everything went fine.

We stopped everything on the old nodes and cleared out the content repos before turning on the new nodes. We let them join and followed their logs. Once they had joined, we turned off the old nodes. Once a new primary node was elected on a new node, we were able to log in to the new UI with our previous flow. Everything came back up without a hitch. I removed the old nodes from the cluster via the UI and carried on as usual.

I'm already getting started with moving my consumers over to using record logic. It's great stuff!

I've turned a lot of skeptics on our engineering team into NiFi fans over the last year as we've leveraged it more and more for scheduling, batch processes, and especially Kafka consumption and streaming ETL.

Thanks again!



Jeremy Farbota
Software Engineer, Data
Payoff, Inc.

[hidden email] | (217) 898-8110




Re: upgrading cluster from 1.0.0 to 1.3.0

Joe Witt
Thanks for reporting back, Jeremy, and glad you had a smooth transition. As you further explore the record stuff, please do share your findings. The record processors make a really massive difference for performance and configurability of common use cases, so we really want to make sure we keep getting them as close to an ideal user experience as we can.

Thanks
Joe
