  1. CloverDX
  2. CLO-397

Disable component as Trash

    Details

    • Branch:
    • QA Testing:
      Graph automated test
    • QA Test Identification:
      DisableAsTrash_*
    • Additional information:
      Motivation: During graph development it's often useful to temporarily disable some components and run the graph to check whether the other components are configured correctly. A typical case is disabling writers, which puts the graph into a dry-run mode. Temporarily disabling components used to be bothersome because the side effects of disabling a component made the graph invalid. For example, disabling a DBOutputTable meant this:

      1. Disable DBOutputTable
      2. Disable all components behind DBOutputTable, if any
      3. Add Trash component
      4. Reconnect edge from DBOutputTable to Trash

      Then the graph was valid again. When you wanted to go back to the original state, you had to revert all of these steps.

      With the new Disable as Trash action you only have to do one thing:

      1. Disable DBOutputTable as Trash

      When you're done debugging, just enable the component again and you're right back where you started.

      What happens to a component disabled as Trash?

      • The component is replaced by a special "trashifier" component that reads all incoming records and discards them.
      • The "trashifier" component has the same ID as the original component. You will be able to see it in the graph log.
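      The replacement described in the two points above can be pictured with a small model. This is only an illustrative sketch, not CloverDX code: the Component class, make_trashifier, write_to_db and the ID DB_OUTPUT_TABLE are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

Record = dict  # hypothetical record type used only in this sketch


@dataclass
class Component:
    """Hypothetical stand-in for a graph component; not a CloverDX class."""
    component_id: str
    process: Callable[[Iterable[Record]], List[Record]]


def make_trashifier(original: Component) -> Component:
    """Build a replacement that keeps the original ID and discards all input."""
    def discard(records: Iterable[Record]) -> List[Record]:
        for _ in records:  # read every incoming record
            pass
        return []          # nothing is written or passed on

    return Component(component_id=original.component_id, process=discard)


written: List[Record] = []  # stands in for the database table


def write_to_db(records: Iterable[Record]) -> List[Record]:
    rows = list(records)
    written.extend(rows)
    return rows


writer = Component("DB_OUTPUT_TABLE", write_to_db)
trash = make_trashifier(writer)

source = iter([{"id": 1}, {"id": 2}])
result = trash.process(source)
```

      In this model the input is fully consumed, nothing reaches the stand-in table, and the replacement reports the same ID as the writer it stands in for.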

      Did you know?

      • Setting a component to Disabled as Trash will never make the graph invalid.
      • Disable as Trash does not affect metadata propagation. For example, a Subgraph component disabled as Trash will still propagate the metadata defined in the subgraph, so your transformations in other components stay valid. Try it yourself in the attached project.
      • Components disabled as Trash don't need to be configured correctly. Missing attributes or wrong mapping? No problem.
      • Disable as Trash can save you a lot of time when you need to disable large parts of a graph. As long as the components are connected by edges, the component disabled as Trash will "block" the whole sub-tree. Try disabling the first component in a component chain: the components that follow it will be disabled too. The state of the subsequent components doesn't change, though; they are enabled again as soon as the first component stops "blocking" them.
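      The "blocking" behaviour from the last point can be sketched as a simple graph traversal. This is a minimal model under stated assumptions, not the actual CloverDX algorithm: the function name blocked_components, the edge dictionary and the component IDs are hypothetical, and the sketch assumes that everything reachable downstream of a trashed component is blocked (how a component with another, unblocked input is treated is not covered by this issue).

```python
from collections import deque
from typing import Dict, List, Set


def blocked_components(edges: Dict[str, List[str]], trashed: Set[str]) -> Set[str]:
    """Return the components that are effectively disabled because they sit
    downstream of a component disabled as Trash.

    The stored enabled/disabled state of those components is never touched;
    the result is derived from the graph each time it is needed.
    """
    blocked: Set[str] = set()
    queue = deque(trashed)
    while queue:
        current = queue.popleft()
        for successor in edges.get(current, []):
            if successor not in blocked and successor not in trashed:
                blocked.add(successor)
                queue.append(successor)
    return blocked


# A simple chain: READER -> REFORMAT -> SORT -> WRITER (hypothetical IDs)
edges = {"READER": ["REFORMAT"], "REFORMAT": ["SORT"], "SORT": ["WRITER"]}

while_trashed = blocked_components(edges, {"REFORMAT"})   # {"SORT", "WRITER"}
after_enabling = blocked_components(edges, set())         # set()
```

      Because the blocked set is computed rather than stored, enabling the first component again leaves every following component exactly as it was configured before.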

      One example to sum it up:
      Imagine you want to edit your transformations in Reformat and HashJoin in this graph. You want to see the debug data on their output edges, but you don't want the graph to perform writes, since the transformations aren't correct yet.

      You can disable the subgraph and DBOutputTable as Trash:

      Now you can edit the transformations and run the graph repeatedly without worrying about writes, since the graph is basically in a dry-run mode. The nice thing is that the edge leading into the subgraph still has the metadata from the subgraph. If you used the old method of reconnecting the edge to a regular Trash component, you would lose the metadata, which would make your transformation in the Reformat invalid.

      The project with the example is available in this archive: trashify_highlight.zip

      Description

      If a writer with a connected input port is disabled in a graph, the graph becomes invalid. This is of course quite reasonable behaviour, since the data has nowhere to flow, but it is bothersome during graph development, when various writers may not be needed for debugging or testing.

      Therefore I think it would be better if a disabled writer did not make the graph invalid, but instead threw away all incoming data (i.e. acted as a trash). It should act as a trash in "Validate records" mode, to make sure the data can still be properly deserialised and would be accepted by the writer. Since the writer knows various things about the data and the output target, it can perform additional validation, but it should not write anything to the output.
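      As a rough illustration of this proposal (validate everything, write nothing), consider the sketch below. It models the request as written in this description, not necessarily what was finally implemented; run_writer, required_fields and the list used as the output target are hypothetical names chosen for the example.

```python
from typing import Dict, Iterable, List


def run_writer(
    records: Iterable[Dict[str, object]],
    required_fields: List[str],
    sink: List[Dict[str, object]],
    enabled: bool = True,
) -> int:
    """Validate every record; append it to `sink` only when the writer is enabled.

    Returns the number of records that passed validation.
    """
    accepted = 0
    for record in records:
        missing = [field for field in required_fields if field not in record]
        if missing:
            raise ValueError(f"record {record!r} is missing fields: {missing}")
        if enabled:
            sink.append(record)  # the only side effect; skipped when disabled
        accepted += 1
    return accepted


table: List[Dict[str, object]] = []  # stands in for the real output target
rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

validated = run_writer(rows, ["id", "name"], table, enabled=False)
```

      With enabled=False the records are still checked one by one, so a malformed record fails the run just as it would with the writer switched on, but the output target stays untouched.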

      This would simplify the following "pattern"

      1. Disable writer
      2. Disconnect writer
      3. Connect trash to the edge instead of the writer
      4. ... (work on the graph) ...
      5. Delete the trash
      6. Reconnect the writer
      7. Enable the writer


      to a very simple process:

      1. Disable writer
      2. ... (work on the graph) ...
      3. Enable writer

              People

              Assignee:
              salamonp Pavel Salamon
              Reporter:
              repcekb Branislav Repcek
              Votes:
              8
              Watchers:
              9
