
What to do when all active Documentum jobs are no longer running?


The application support team informed me that their jobs were not running anymore. When I started the analysis, I found that none of the activated jobs had started for a few weeks.

First of all, I decided to work on a specific job which is not one of the application team's but one that I know I can start several times without impacting the business.
Do you know which one? dm_ContentWarning

I checked the job attributes like start_date, expiration_date, is_inactive, target_server (as we have several Content Servers to cover high availability), a_last_invocation, a_next_invocation and of course a_current_status.
Once this first check was done, I started the job from Documentum Administrator (DA): I selected "Run now" and saved the job.

  object_name                : dm_ContentWarning
  start_date                 : 5/30/2017 20:00:00
  expiration_date            : 5/30/2025 20:00:00
  max_iterations             : 0
  run_interval               : 1
  run_mode                   : 3
  is_inactive                : F
  inactivate_after_failure   : F
  target_server              : Docbase1.Docbase1@vmcs1.dbi-services.com
  a_last_invocation          : 9/20/2018 19:05:29
  a_last_completion          : 9/20/2018 19:07:00
  a_current_status           : ContentWarning Tool Completed at
                        9/20/2018 19:06:50.  Total duration was
                        1 minutes.
  a_next_invocation          : 9/21/2018 19:05:00
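
As a side note, the same effect as DA's "Run now" can also be obtained from the command line by setting the job's run_now flag, which is the very flag tested by the agent exec scheduling query shown later in this post. A minimal iapi sketch (adjust the password):

$ iapi Docbase1 -Udmadmin -Pxxx <<eoq
retrieve,c,dm_job where object_name = 'dm_ContentWarning'
set,c,l,run_now
T
save,c,l
quit
eoq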

A few minutes later, I checked the result and the attributes again, not all of them like before but only a_last_completion and a_next_invocation, and of course the content of the job log file. The job ran as expected when I forced it to run.

  a_last_completion          : 10/31/2018 10:41:25
  a_current_status           : ContentWarning Tool Completed at
                        10/31/2018 10:41:14.  Total duration
                        was 2 minutes.
  a_next_invocation          : 10/31/2018 19:05:00
[dmadmin@vmcs1 agentexec]$ more job_0801234380000359
Wed Oct 31 10:39:54 2018 [INFORMATION] [LAUNCHER 12071] Detected while preparing job dm_ContentWarning for execution: Agent Exec
connected to server Docbase1:  [DM_SESSION_I_SESSION_START]info:  "Session 01012343807badd5 started for user dmadmin."
...
...

OK, the job ran and a_next_invocation was set according to run_interval and run_mode, in our case once a day. I thought I had found the reason for the issue: the repository had been stopped for a few days and therefore, when it was restarted, the a_next_invocation date was in the past (a_next_invocation: 9/21/2018 19:05:00). So I decided to check the result the day after, once the job had run based on the defined schedule (a_next_invocation: 10/31/2018 19:05:00).

The next day… the job did not run. Strange!
I decided to think a bit deeper ;-) and to go a step further: set the a_next_invocation date so that the job would run in 5 minutes.

update dm_job objects set a_next_invocation = date('01.11.2018 11:53:00','dd.mm.yyyy hh:mi:ss') where object_name = 'dm_ContentWarning';
1

select r_object_id, object_name, a_next_invocation from dm_job where object_name = 'dm_ContentWarning';
0801234380000359	dm_ContentWarning	11/01/2018 11:53:00

Result: the job did not start. 🙁 Hmmm, why?

Before continuing to work on the job, I did some other checks, like analyzing the various log files (repository, agent exec, sysadmin, etc.).
I found that the database had been down a few days before, so I restarted the repository and set the a_next_invocation again, but unfortunately this did not help.

To be sure it was not related to the whole installation, I successfully ran a distributed job (dm_contentWarningvmcs2_Docbase1) on the second Content Server. This meant the issue was located only on my first Content Server.

I searched the OpenText knowledge base (KB9264366, KB8716186 and KB6327280), but none of these articles gave me the solution.

I knew, even if I have not used it often in my last 20 years in the Documentum world, that the agent exec can be traced, so let's look at this:

  1. add the parameter -trace_level 1 to the agent exec method (see the sketch after the process listing below)
  2. reinit the server
  3. kill the dm_agent_exec process related to Docbase1; the process will be restarted automatically after a few minutes.
[dmadmin@vmcs1 agentexec]$ ps -ef | grep agent | grep Docbase1
dmadmin  27312 26944  0 Oct31 ?        00:00:49 ./dm_agent_exec -enable_ha_setup 1 -docbase_name Docbase1.Docbase1 -docbase_owner dmadmin -sleep_duration 0
[dmadmin@vmcs1 agentexec]$ kill -9 27312
[dmadmin@vmcs1 agentexec]$ ps -ef | grep agent | grep Docbase1
[dmadmin@vmcs1 agentexec]$
[dmadmin@vmcs1 agentexec]$ ps -ef | grep agent | grep Docbase1
dmadmin  15440 26944 57 07:48 ?        00:00:06 ./dm_agent_exec -enable_ha_setup 1 -trace_level 1 -docbase_name Docbase1.Docbase1 -docbase_owner dmadmin -sleep_duration 0
[dmadmin@vmcs1 agentexec]$
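
For step 1 above, the -trace_level 1 parameter is appended to the agent exec method's method_verb. A minimal iapi sketch; the method object is looked up by its verb since its exact object_name may differ between versions, the get displays the current value so any existing flags can be preserved, and the value set below is illustrative (remove the parameter the same way once the analysis is done):

$ iapi Docbase1 -Udmadmin -Pxxx <<eoq
retrieve,c,dm_method where method_verb like './dm_agent_exec%'
get,c,l,method_verb
set,c,l,method_verb
./dm_agent_exec -trace_level 1
save,c,l
reinit,c
quit
eoq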

I changed the a_next_invocation again and checked the agent exec log file, where the executed queries are recorded.
Two recorded queries seemed important:

SELECT count(r_object_id) as cnt FROM dm_job WHERE ( (run_now = 1) OR ((is_inactive = 0) AND ( ( a_next_invocation <= DATE('now') AND a_next_invocation IS NOT NULLDATE ) OR ( a_next_continuation <= DATE('now')) ) AND ( (expiration_date > DATE('now')) OR (expiration_date IS NULLDATE)) AND ((max_iterations = 0) OR (a_iterations < max_iterations))) ) AND (i_is_reference = 0 OR i_is_reference is NULL) AND (i_is_replica = 0 OR i_is_replica is NULL) AND UPPER(target_server) = 'DOCBASE1.DOCBASE1@VMCS1.DBI-SERVICES.COM'

SELECT ALL r_object_id, a_next_invocation FROM dm_job WHERE ( (run_now = 1) OR ((is_inactive = 0) AND ( ( a_next_invocation <= DATE('now') AND a_next_invocation IS NOT NULLDATE ) OR ( a_next_continuation <= DATE('now')) ) AND ( (expiration_date > DATE('now')) OR (expiration_date IS NULLDATE)) AND ((max_iterations = 0) OR (a_iterations < max_iterations))) ) AND (i_is_reference = 0 OR i_is_reference is NULL) AND (i_is_replica = 0 OR i_is_replica is NULL) AND UPPER(target_server) = 'DOCBASE1.DOCBASE1@VMCS1.DBI-SERVICES.COM' ORDER BY run_now DESC, a_next_invocation, r_object_id ENABLE (RETURN_TOP 3 )

I executed the second query and it returned three jobs (RETURN_TOP 3), all belonging to the application team. As these three jobs have an old a_next_invocation value, they never run, yet they are always the ones selected when the query is executed; unfortunately, this means my dm_ContentWarning job will never be selected for automatic execution.
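
A simplified variant of that recorded query (not the exact agent exec statement) can help spot the jobs with stale a_next_invocation values that monopolize the top of the scheduling queue:

$ idql Docbase1 -Udmadmin -Pxxx <<eoq
select r_object_id, object_name, a_next_invocation, a_last_completion
from dm_job
where is_inactive = false and a_next_invocation <= DATE(NOW)
order by run_now desc, a_next_invocation, r_object_id
enable (RETURN_TOP 10)
go
eoq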

I informed the application team that I would keep only one job active (dm_ContentWarning) to see whether it would run. And guess what, it ran … YES!

Okay, now we have the solution:

  • reactivate all previously deactivated jobs
  • set their a_next_invocation to a future date (see the sketch below)
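
A hedged DQL sketch of these two steps; the job names are hypothetical placeholders for the application jobs and the date format is the same as in the update statement used earlier:

$ idql Docbase1 -Udmadmin -Pxxx <<eoq
update dm_job objects set is_inactive = false
where object_name in ('app_job_1','app_job_2','app_job_3')
go
update dm_job objects set a_next_invocation = date('02.11.2018 20:00:00','dd.mm.yyyy hh:mi:ss')
where object_name in ('app_job_1','app_job_2','app_job_3')
go
eoq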

And do not forget to deactivate the trace for the dm_agent_exec.

The article What to do when all active Documentum jobs are no longer running? appeared first on Blog dbi services.


An exotic feature in the content server: check_client_version



A few months ago, I tripped over a very mysterious problem while attempting to connect to a 7.3 CS docbase from within dqMan.
We had 3 docbases and we could connect using this client to all of them but one ! Moreover, we could connect to all three using a remote Documentum Administrator or the local idql/iapi command-line tools. Since we could connect to at least one of them with dqMan, this utility was not guilty. Also, since all three docbases accepted connections, they were all OK in this respect. Ditto for the account used, dmadmin or nominative ones; local connections were possible hence the accounts were all active and, as they could be used from within the remote DA, their identification method and password were correct too.
We tried connecting from different workstations in order to check the dqMan side, we cleared its caches, we reinstalled it, but to no avail. We checked the content server’s log, as usual nothing relevant. It was just the combination of this particular docbase AND dqMan. How strange !
So what the heck was wrong here ?
As we weren't the only administrators of those repositories, we more or less suspected that someone else had changed some setting, but which one? OK, I sort of gave it away in the title, but please bear with me and read on.
I don’t remember exactly how, we were probably working in panic mode, but we eventually decided to compare the docbases’ dm_docbase_config object side by side as shown below (with some obfuscation):

paste <(iapi bad_boy -Udmadmin -Pxxx <<eoq | awk '{print substr($0, 1, 80)}'
retrieve,c,dm_docbase_config
dump,c,l
quit
eoq
) <(iapi good_boy -Udmadmin -Pxxx <<eoq | awk '{print substr($0, 1, 80)}'
retrieve,c,dm_docbase_config
dump,c,l
quit
eoq
) | column -c 30 -s $'\t' -t | tail -n +11 | head -n 48
USER ATTRIBUTES                                          USER ATTRIBUTES
  object_name                     : bad_boy                object_name                     : good_boy
  title                           : bad_boy Repository     title                           : good_boy Global Repository
  subject                         :                        subject                         :
  authors                       []:                        authors                       []: 
  keywords                      []:                        keywords                      []: 
  resolution_label                :                        resolution_label                :
  owner_name                      : bad_boy                owner_name                      : good_boy
  owner_permit                    : 7                      owner_permit                    : 7
  group_name                      : docu                   group_name                      : docu
  group_permit                    : 5                      group_permit                    : 5
  world_permit                    : 3                      world_permit                    : 3
  log_entry                       :                        log_entry                       :
  acl_domain                      : bad_boy                acl_domain                      : good_boy
  acl_name                        : dm_450xxxxx80000100    acl_name                        : dm_450xxxxx580000100
  language_code                   :                        language_code                   :
  mac_access_protocol             : nt                     mac_access_protocol             : nt
  security_mode                   : acl                    security_mode                   : acl
  auth_protocol                   :                        auth_protocol                   :
  index_store                     : DM_bad_boy_INDEX       index_store                     : DM_good_boy_INDEX
  folder_security                 : T                      folder_security                 : T
  effective_date                  : nulldate               effective_date                  : nulldate
  richmedia_enabled               : T                      richmedia_enabled               : T
  dd_locales                   [0]: en                     dd_locales                   [0]: en
  default_app_permit              : 3                      default_app_permit              : 3
  oldest_client_version           :                        oldest_client_version           :
  max_auth_attempt                : 0                      max_auth_attempt                : 0
  client_pcaching_disabled        : F                      client_pcaching_disabled        : F
  client_pcaching_change          : 1                      client_pcaching_change          : 1
  fulltext_install_locs        [0]: dsearch                fulltext_install_locs        [0]: dsearch
  offline_sync_level              : 0                      offline_sync_level              : 0
  offline_checkin_flag            : 0                      offline_checkin_flag            : 0
  wf_package_control_enabled      : F                      wf_package_control_enabled      : F
  macl_security_disabled          : F                      macl_security_disabled          : F
  trust_by_default                : T                      trust_by_default                : T
  trusted_docbases              []:                        trusted_docbases              []: 
  login_ticket_cutoff             : nulldate               login_ticket_cutoff             : nulldate
  auth_failure_interval           : 0                      auth_failure_interval           : 0
  auth_deactivation_interval      : 0                      auth_deactivation_interval      : 0
  dir_user_sync_on_demand         : F                      dir_user_sync_on_demand         : F
  check_client_version            : T                      check_client_version            : F
  audit_old_values                : T                      audit_old_values                : T
  docbase_roles                 []:                        docbase_roles                [0]: Global Registry
  approved_clients_only           : F                      approved_clients_only           : F
  minimum_owner_permit            : 2                      minimum_owner_permit            : 0
  minimum_owner_xpermit           :                        minimum_owner_xpermit           :
  dormancy_status                 :                        dormancy_status                 :

The only significant differences were the highlighted ones, and the most obvious one was the attribute check_client_version: it was turned on in the bad_boy repository. Now that we finally had something to blame, the universe started making sense again ! We quickly turned this setting to false and could eventually connect to that recalcitrant docbase. But the question was still open: check against what ? What criterion was applied to refuse dqMan access to bad_boy but allow it to good_boy ? That was still not clear, even though we could work around it.
As for who turned it on and why, that had to remain a mystery.
While we were at it, we also noticed another attribute which seemed to be related to the previous one: oldest_client_version.
Was there any other client_% attribute in dm_docbase_config ?

paste <(iapi good_boy -Udmadmin -Pdmadmin <<eoq | grep client
retrieve,c,dm_docbase_config
dump,c,l
quit
eoq
) <(iapi bad_boy -Udmadmin -Pdmadmin <<eoq | grep client
retrieve,c,dm_docbase_config
dump,c,l
quit
eoq
) | column -s $'\t' -t
  oldest_client_version           :      oldest_client_version           : 
  client_pcaching_disabled        : F    client_pcaching_disabled        : F
  client_pcaching_change          : 1    client_pcaching_change          : 1
  check_client_version            : F    check_client_version            : T
  approved_clients_only           : F    approved_clients_only           : F

Yes, but they looked quite harmless in the current context.
Thus, the relevant attributes here are check_client_version and oldest_client_version. Let’s discover a bit more about them.
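
For a quick check without the side-by-side dump, the same attributes can also be read with a short DQL query against each docbase, e.g.:

$ idql bad_boy -Udmadmin -Pxxx <<eoq
select check_client_version, oldest_client_version, approved_clients_only from dm_docbase_config
go
eoq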

Digging

As usual, the documentation is a bit sketchy about these attributes:

check_client_version Boolean S T means that the repository
                               servers will not accept connections
                               from clients older than the
                               version level specified in the
                               oldest_client_version property.
                               F means that the servers accept
                               connections from any client version.
                               The default is F.

oldest_client_version string(32) S Version number of the oldest
                                    Documentum client that will access
                                    this repository.
                                    This must be set manually. It is used
                                    by the DFC to determine how to
                                    store chunked XML documents. If
                                    check_client_version is set to T,then
                                    this value is also used to identify the
                                    oldest client version level that may
                                    connect to the repository.

But what is the client version ? Logically, it is the version of its DfCs or, for older clients, the version of the dmcl shared library.
So, if check_client_version is true, the client version is checked and, if it is older than the one defined in oldest_client_version, the client is not allowed to connect. That makes sense, except that in our case oldest_client_version was empty. Maybe in such a case the client has to match the content server's DfC version exactly ? As dqMan was using either the dmcl40.dll library or an old DfC version, it was rejected. Let's verify these hypotheses with a 16.4 target repository.
Connecting from an ancient 5.3 client
We exhumed an old 5.3 CS installation to use its client part with the default configuration in the target docbase:

dmadmin@osboxes:~/documentum53$ idql dmtest -Udmadmin -Pdmadmin
 
 
Documentum idql - Interactive document query interface
(c) Copyright Documentum, Inc., 1992 - 2004
All rights reserved.
Client Library Release 5.3.0.115 Linux
 
 
Connecting to Server using docbase dmtest
[DM_SESSION_I_SESSION_START]info: "Session 0100c35080003913 started for user dmadmin."
 
 
Connected to Documentum Server running Release 16.4.0080.0129 Linux64.Oracle

Fine so far.
Let’s activate the dm_docbase_config.check_client_version in the target:

retrieve,c,dm_docbase_config
...
set,c,l,check_client_version
SET> T
...
OK
API> save,c,l
...
[DM_DCNFG_E_CANT_SAVE]error: "Cannot save dmtest docbase_config."
 
[DM_DCNFG_E_SET_OLDEST_CLIENT_VERSION_FIRST]error: "The docbase_config object attribute oldest_client_version has to be set before setting attribute check_client_version to T."

Interesting. At that time, this attribute was empty and yet check_client_version was active. Is this constraint new in 16.4 or did the unknown administrator hack around it ? As I don't have a 7.x repository available right now, I cannot test this point.
Let’s play by the rules and set oldest_client_version:

reset,c,l
set,c,l,oldest_client_version
16.4
save,c,l
OK
set,c,l,check_client_version
SET> T
...
OK
API> save,c,l
...
OK

Try connecting from the 5.3 client: still OK.
Maybe a reinit is necessary to actuate the changes:

reinit,c

Try again:

dmadmin@osboxes:~/documentum53$ idql dmtest -Udmadmin -Pdmadmin
Documentum idql - Interactive document query interface
(c) Copyright Documentum, Inc., 1992 - 2004
All rights reserved.
Client Library Release 5.3.0.115 Linux
 
 
Connecting to Server using docbase dmtest
Could not connect
[DM_SESSION_E_START_FAIL]error: "Server did not start session. Please see your system administrator or check the server log.
Error message from server was:
[DM_SESSION_E_AUTH_FAIL]error: "Authentication failed for user dmadmin with docbase dmtest."
 
"

So a reinit is indeed required.
Note the misleading error: it is not the authentication that is wrong but the client version validation. Such wrong messages are what make the diagnosis of Documentum problems so hard and time-consuming. Anyway, let's revert check_client_version to F:

set,c,l,check_client_version
F
save,c,l
reinit,c

Try connecting: OK. So, the client version filtering is effective. Let’s try it with a 5.3 client version:

API> set,c,l,oldest_client_version
SET> 5.3
...
OK
API> save,c,l
...
OK
API> set,c,l,check_client_version
SET> T
...
OK
API> save,c,l
...
OK
API> reinit,c
...
OK

Try connecting: OK, that’s expected.
Let's try it with a minimum 5.2 client version: it still works, which is expected too since the test client's version is 5.3 and, in my books, 5.3 > 5.2.
Let's try it with a minimum 5.4 client version: the connection fails, so client version checking works as expected here.
Let's try it with a minimum 20.0 client version: the connection fails as expected. No check on the version's value is done, which is quite understandable programmatically speaking, although a bit optimistic in the context of the turmoil Documentum went through lately.
Let’s go back to a more realistic value:

API> set,c,l,oldest_client_version
SET> 7.2
...
[DM_SESSION_E_AUTH_FAIL]error: "Authentication failed for user dmadmin with docbase dmtest."
 
 
API> save,c,l

Oops. Interestingly, the last change did not make it because, with the current setting so far into the future, the present client's session was disconnected and there was no way to reconnect in order to revert it !
Let’s do the rollback from the database level directly:

sqlplus dmtest@orcl
 
SQL*Plus: Release 12.2.0.1.0 Production on Mon Jun 10 16:25:56 2019
 
Copyright (c) 1982, 2016, Oracle. All rights reserved.
 
Enter password:
Last Successful login time: Mon Jun 10 2019 16:25:40 +02:00
 
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
 
SQL> update dm_docbase_config_s set check_client_version = 0;
 
1 row updated.
SQL> commit;
 
Commit complete.
 
quit;

Try to connect:

iapi dmtest@docker:1489
Please enter a user (dmadmin):
Please enter password for dmadmin:
 
 
OpenText Documentum iapi - Interactive API interface
Copyright (c) 2018. OpenText Corporation
All rights reserved.
Client Library Release 16.4.0070.0035
 
 
Connecting to Server using docbase dmtest
[DM_SESSION_E_AUTH_FAIL]error: "Authentication failed for user dmadmin with docbase dmtest."

Still not OK because the reinit is missing, but for that we need to connect, which we still cannot do because of the missing reinit. To break this catch-22 situation, let's cut the Gordian knot and kill the dmtest docbase's processes:

dmadmin@docker:~$ ps ajxf | grep dmtest
1 27843 27843 27843 ? -1 Ss 1001 0:00 ./documentum -docbase_name dmtest -security acl -init_file /app/dctm/dba/config/dmtest/server.ini
27843 27849 27843 27843 ? -1 S 1001 0:00 \_ /app/dctm/product/16.4/bin/mthdsvr master 0xe901fd2f, 0x7f8a50658000, 0x223000 50000 5 27843 dmtest /app/dctm/dba/log
27849 27850 27843 27843 ? -1 Sl 1001 0:03 | \_ /app/dctm/product/16.4/bin/mthdsvr worker 0xe901fd2f, 0x7f8a50658000, 0x223000 50000 5 0 dmtest /app/dctm/dba/log
27849 27861 27843 27843 ? -1 Sl 1001 0:03 | \_ /app/dctm/product/16.4/bin/mthdsvr worker 0xe901fd2f, 0x7f8a50658000, 0x223000 50000 5 1 dmtest /app/dctm/dba/log
27849 27874 27843 27843 ? -1 Sl 1001 0:03 | \_ /app/dctm/product/16.4/bin/mthdsvr worker 0xe901fd2f, 0x7f8a50658000, 0x223000 50000 5 2 dmtest /app/dctm/dba/log
27849 27886 27843 27843 ? -1 Sl 1001 0:03 | \_ /app/dctm/product/16.4/bin/mthdsvr worker 0xe901fd2f, 0x7f8a50658000, 0x223000 50000 5 3 dmtest /app/dctm/dba/log
27849 27899 27843 27843 ? -1 Sl 1001 0:03 | \_ /app/dctm/product/16.4/bin/mthdsvr worker 0xe901fd2f, 0x7f8a50658000, 0x223000 50000 5 4 dmtest /app/dctm/dba/log
27843 27862 27843 27843 ? -1 S 1001 0:00 \_ ./documentum -docbase_name dmtest -security acl -init_file /app/dctm/dba/config/dmtest/server.ini
27843 27863 27843 27843 ? -1 S 1001 0:00 \_ ./documentum -docbase_name dmtest -security acl -init_file /app/dctm/dba/config/dmtest/server.ini
27843 27875 27843 27843 ? -1 S 1001 0:00 \_ ./documentum -docbase_name dmtest -security acl -init_file /app/dctm/dba/config/dmtest/server.ini
27843 27887 27843 27843 ? -1 S 1001 0:00 \_ ./documentum -docbase_name dmtest -security acl -init_file /app/dctm/dba/config/dmtest/server.ini
27843 27901 27843 27843 ? -1 S 1001 0:00 \_ ./documentum -docbase_name dmtest -security acl -init_file /app/dctm/dba/config/dmtest/server.ini
27843 27944 27843 27843 ? -1 Sl 1001 0:06 \_ ./dm_agent_exec -docbase_name dmtest.dmtest -docbase_owner dmadmin -sleep_duration 0
27843 27962 27843 27843 ? -1 S 1001 0:00 \_ ./documentum -docbase_name dmtest -security acl -init_file /app/dctm/dba/config/dmtest/server.ini

and:

kill -9 -27843

After restarting the docbase, the connectivity was restored.
So, be cautious while experimenting ! Needless to say, avoid doing it in a production docbase or in any heavily used development docbase for that matter, or the wrath of the multiverses and beyond will fall upon you and you will be miserable for ever.
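
For completeness, the restart itself relied on the standard start script generated at installation time; a hedged sketch, assuming the usual dm_start_<docbase> naming under $DOCUMENTUM/dba, followed by a quick check of the reverted flag:

$ $DOCUMENTUM/dba/dm_start_dmtest
$ iapi dmtest -Udmadmin -Pdmadmin <<eoq
retrieve,c,dm_docbase_config
get,c,l,check_client_version
quit
eoq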
Connecting from a 7.3 client
The same behavior and error messages as with the preceding 5.3 client were observed with a more recent 7.3 client and, inferring from the incident above, later clients behave the same way.

Conclusion

We never stop learning stuff with Documentum ! While this client version limiting feature looks quite exotic, it may make sense in order to avoid surprises or even corruptions when using newly implemented extensions or existing but changed areas of the content server. It is possible that new versions of the DfCs behave differently from older ones when dealing with the same functionalities, and Documentum had no better choice than to cut the older versions off to prevent any conflict. As usual, the implementation looks a bit hasty, with inapt error messages costing hours of investigation and the risk of cutting oneself off from a repository.

The article An exotic feature in the content server: check_client_version appeared first on Blog dbi services.

Connecting to a Repository via a Dynamically Edited dfc.properties File (part I)


Connecting to a Repository via a Dynamically Edited dfc.properties File

Now that we have containerized content servers, it is very easy, maybe too easy, to create new repositories. Their creation is still not any faster (whether they are containerized or not is irrelevant here) but given a configuration file it just takes one command to instantiate an image into a running container with working repositories in it. Thus, during experimentation and testing, out of laziness or in a hurry, one can quickly finish up having several containers with identically named repositories, e.g. dmtest01, with an identically named docbroker, e.g. docbroker01. Now, suppose one wants to connect to the docbase dmtest01 running on the 3rd such container using the familiar command-line tools idql/iapi/dmawk. How then to select that particular instance of dmtest01 among all the others ?
To be more precise about the test case, let's say that we are using a custom bridge network to link the containers together on the docker host (appropriately named docker), which is a VirtualBox VM running an Ubuntu flavor. The metal also runs the same Ubuntu distro natively. It looks complicated but actually matches the common on-premises infrastructure type where the metal is an ESX or equivalent, its O/S is the hypervisor and the VMs run a Red Hat or SUSE distro. As this is a local testing environment, no DNS or network customizations have been introduced save for the custom bridge.
We want to reach a remote repository either from container to container or from container to host or from host to container.
The problem here stems from the lack of flexibility in the docbroker/dfc.properties file mechanism and no network fiddling can work around this.

It’s All in The dfc.properties File

Containers have distinct host names, so it suffices to edit their local dfc.properties file and change this field only. Their files may all look like the one below:

dfc.docbroker.host[0]=container01
dfc.docbroker.port[0]=1489
dfc.docbroker.host[1]=docker
dfc.docbroker.port[1]=1489
dfc.docbroker.host[3]=container011
dfc.docbroker.port[3]=1489
dfc.docbroker.host[4]=container02
dfc.docbroker.port[4]=1489

In effect, the custom bridge network embeds a DNS for all the attached containers, so their host names are known to each other (but not to the host, so IP addresses must be used from there, or the host's /etc/hosts file must be edited). The docbroker ports are the ones inside the containers and all have the same value, 1489, because the containers were created out of the same configuration files. The docker entry has been added to the containers' /etc/hosts file via the --add-host= clause of the docker run command.
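
As an illustration, the containers could have been started along these lines; this is a hedged sketch where the network name, the host-side IP for the docker entry and the image name are placeholders, not values taken from the actual setup:

$ docker network create --subnet=192.168.33.0/24 dctmnet
$ docker run -d --name container01 --hostname container01 \
      --net dctmnet --ip 192.168.33.101 \
      --add-host docker:192.168.33.1 \
      -p 2489:1489 \
      dctm-cs:16.4
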
For the containers’ host machine, where a Documentum repository has been installed too, the dfc.properties file could look like this one:

dfc.docbroker.host[0]=docker
dfc.docbroker.port[0]=1489
dfc.docbroker.host[1]=docker
dfc.docbroker.port[1]=2489
dfc.docbroker.host[3]=docker
dfc.docbroker.port[3]=3489
dfc.docbroker.host[4]=docker
dfc.docbroker.port[4]=5489

Here, the host name is the one of the VM where the containers sit and is the same for all the containers. The port numbers differ because they are the containers' external ports, which are published on the host VM and mapped to the respective docbroker's internal port, 1489. Since the containers share the same custom network, their host names, IP addresses and external ports must all be different when running the image, or docker won't allow it.
Alternatively, the container’s IP addresses and internal docbroker’s ports could be used directly too if one is too lazy to declare the containers’ host names in the host’s /etc/hosts file, which is generally the case when testing:

dfc.docbroker.host[0]=docker 
dfc.docbroker.port[0]=1489
dfc.docbroker.host[1]=192.168.33.101
dfc.docbroker.port[1]=1489
dfc.docbroker.host[2]=192.168.33.102
dfc.docbroker.port[2]=1489
dfc.docbroker.host[3]=192.168.33.104
dfc.docbroker.port[3]=1489

The host’s custom network will take care of routing the traffic into the respective containers.
Can you spot the problem now ? As all the containers contain identically named repositories (for clarity, let’s say that we are looking for the docbase dmtest01), the first contacted docbroker in that file will always reply successfully because there is indeed a dmtest01 docbase in that container and consequently one will always be directed to the docbase container01.dmtest01. If one wants to contact container03.dmtest01, this configuration won’t let do it. One would need to edit it and move the target container03 host in the first position, which is OK until one wants to access container02.dmtest01 or go back to container01.dmtest01.
This situation has existed forever, but containers make it more obvious because they make it so much easier to have repository homonyms.
So is there a simpler way to work around this limitation than editing back and forth a configuration file or giving different names to the containerized repositories ?

A Few Reminders

Documentum has made quite a lot of design decisions inspired by the Oracle DBMS but their implementation is far from offering the same level of flexibility and power, and this is often irritating. Let’s consider the connectivity for example. Simply speaking, Oracle’s SQL*Net configuration relies mainly on a tnsnames.ora file for the connectivity (it can also use a centralized ldap server but let’s keep it simple). This file contains entries used to contact listeners and get the information needed to connect to the related database. Minimal data to provide in the entries are the listener’s hostname and port, and the database sid or service name, e.g.:

...
ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = db)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = db_service)
    )
  )
...

A connection to the database db_service can simply be requested as follows:

sqlplus scott@orcl

orcl is the SQL*Net alias for the database served by db_service. It works like an index in a lookup table, the tnsnames.ora file.
Compare this with a typical dfc.properties file, e.g. /home/dmadmin/documentum/shared/config/dfc.properties:

...
dfc.docbroker.host[0]=docker
dfc.docbroker.port[0]=1489
dfc.docbroker.host[1]=dmtest
dfc.docbroker.port[1]=1489
...

Similarly, instead of contacting listeners, we have here docbrokers. A connection to the docbase dmtest can be requested as follows:

idql dmtest

dmtest is the target repository. It is not a lookup key in the dfc.properties file. Unlike the tnsnames.ora file and its aliases, there is an indirection here: the dfc.properties file does not directly tell where to find a certain repository, it just lists the docbrokers to be queried sequentially about it until the first one that knows the repository (or a homonym thereof) answers. If the returned target docbase is the wrong homonym, tough luck, it will not be reachable unless the order of the entries is changed. Repositories announce themselves to the docbrokers by “projecting” themselves. If two repositories with the same name project to the same docbroker, no error is raised but the docbroker can return unexpected results, e.g. one may finish up in the unintended docbase.
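
A quick way to see which repositories project to a given docbroker, and hence to detect such homonyms, is to query the docbroker directly with the dmqdocbroker utility shipped with the content server, e.g.:

$ dmqdocbroker -t docker -p 1489 -c getdocbasemap
$ dmqdocbroker -t docker -p 2489 -c getdocbasemap
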
Another major difference is that with Oracle but not with Documentum, it is possible to bypass the tnsnames.ora file by specifying the connection data in-line, e.g. on the command-line:

sqlplus scott@'(DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = db)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = db_service)
    )
  )'

This can be very useful when editing the local, official tnsnames.ora file is not allowed, and it is sometimes faster than setting $TNS_ADMIN to an accessible local directory and editing a private tnsnames.ora file there.
This annoyance is even more frustrating because Documentum’s command-line tools do support a similar syntax but for a different purpose:

idql repository[.service][@machine] [other parameters]

While this syntax is logically useful to access the service (akin to an Oracle’s instance but for a HA Documentum installation), it is used in a distributed repository environment to contact a particular node’s docbroker; however, it still does not work if that docbroker is not first declared in the local dfc.properties file.
Last but not least, one more reason to be frustrated is that the DfCs do allow choosing a specific docbroker when opening a session, as illustrated by the jython snippet below:

import traceback
import com.documentum.fc.client as DFCClient
import com.documentum.fc.common as DFCCommon

docbroker_host = "docker"
docbroker_port = "1489"
docbase = "dmtest"
username = "dmadmin"
password = "dmadmin"
print("attempting to connect to " + docbase + " as " + username + "/" + password + " via docbroker on host " + docbroker_host + ":" + docbroker_port)
try:
  client = DFCClient.DfClient.getLocalClient()

  config = client.getClientConfig()
  config.setString ("primary_host", docbroker_host)
  config.setString ("primary_port", docbroker_port)

  logInfo = DFCCommon.DfLoginInfo()
  logInfo.setUser(username)
  logInfo.setPassword(password)
  docbase_session = client.newSession(docbase, logInfo)

  if docbase_session is not None:
    print("Connected !")
  else:
    print("Couldn't connect !")
except Exception:
  traceback.print_exc()

Content of dfc.properties:

$ cat documentum/shared/config/dfc.properties
dfc.date_format=dd.MM.yyyy HH:mm:ss

Execution:

$ jython ./test.jy
...
attempting to connect to dmtest as dmadmin/dmadmin via docbroker docker
Connected !

Despite a dfc.properties file devoid of any docbroker definition, the connection was successful. Unfortunately, this convenience has not been carried over to the vegetative command-line tools.
While we can dream and hope for those tools to be resurrected and for a backport miracle to happen (are you listening, OTX ?), the next best thing is to tackle this shortcoming ourselves and implement a solution that is as unobtrusive as possible. Let's see how.

A few Proposals

Currently, one has to manually edit the local dfc.properties file, but this is tedious to say the least, because changes must sometimes be done twice, applied first and then rolled back if the change is only temporary. To avoid this, we could add at once to our local dfc.properties file all the machines that host repositories of interest, but this file could quickly grow large and it would not solve the case of repository homonyms. The situation would become quite unmanageable, although an environment variable such as the late DMCL_CONFIG (appropriately revamped, e.g. to DFC_PROPERTIES_CONFIG, holding the full path name of the dfc.properties file to use) could help organize those entries. But there is no such variable any longer for the command-line tools (those tools have stopped evolving since CS v6.x), although there is a property for the DfCs to pass to the JVM at startup, -Ddfc.properties.file, or even the #include clause in the dfc.properties file, or playing with the $CLASSPATH. But there is a better way.
What about an on-the-fly, transparent, behind the scenes dfc.properties file editing to support a connection syntax similar to the Oracle’s in-line one ?
Proposal 1
Let’s specify the address of the docbroker of interest directly on the command-line, as follows:

$ idql dmtest01@container03:3489
or
$ idql dmtest01@192.168.33.104:3489

This is more akin to Oracle in-line connection syntax above.
Proposal 2
An alternative could be to use an Oracle’s tnsnames.ora-like configuration file such as the one below (and (in (keeping (with (the (lisp spirit)))))):

dmtest01 = ((docbroker.host = container01) (docbroker.port = 1489))
dmtest02 = ((docbroker.host = container02) (docbroker.port = 1489))
dmtest03 = ((docbroker.host = container03) (docbroker.port = 1489))

and to use it thusly:

$ idql dmtest01@dmtest03

dmtest03 is looked up in the configuration file and replaced on the command-line by its definition.
Proposal 3
With a more concise configuration file that can also be sourced:

dmtest01=container01:1489
dmtest02=container02:1489
dmtest03=container03:1489

and used as follows:

$ export REPO_ALIAS=~/repository_connections.aliases
$ . $REPO_ALIAS
$ ./widql dmtest01@$dmtest03

$dmtest03 is directly fetched from the environment after the configuration file has been sourced, which is equivalent to a lookup. Since the variable substitution occurs at the shell level, it comes free of charge.
With a bit more generalization, it is possible to merge the three proposals together:

$ idql repository(@host_literal:port_number) | @$target

In other words, one can either literally provide the full connection information or provide a variable which will be resolved by the shell from a configuration file sourced beforehand.
Let’s push the configuration file a bit farther and define complete aliases up to the repository name like this:

dmtest=dmtest@docker:1489
or even so:
dmtest=dmtest:docker:1489

Usage:

$ ./widql $dmtest

The shell will expand the alias with its definition. The good thing is that the definition styles can be mixed and matched to suit one's fancy. Example of a configuration file:

# must be sourced prior so the environment variables can be resolved;
# this is an enhancement over the dfc.properties file syntax used by the dctm_wrapper utility:
# docbroker.host[i]=...
# docbroker.port[i]=...
# it supports several syntaxes:
# docbroker only definition docbroker_host:port;
#    usage: ./widql dmtest@$dmtest
# full definition docbase[@[docbroker_host]:[port]]
#    usage: ./widql $test
# alternate ':' separator docbase:[docbroker_host]:[port];
#    usage: ./widql $dmtestVM
# alias literal;
#    usage: ./widql test
# in order to resolve alias literals, the wrapper will source the configuration file by itself;

# docker.dmtest;
# docbroker only definition;
d_dmtest=docker:1489
# full definition;
f_dmtest=dmtest@docker:1489
# alternate ':' separator;
a_dmtest=dmtest:docker:1489

# container01.dmtest01;
# docbroker only definition;
d_dmtest01=container01:2489
dip_dmtest01=192.168.33.101:1489
# full definition;
f_dmtest01=dmtest01@container01:2489
fip_dmtest01c=dmtest01@192.168.33.101:1489
# alternate ':' separator;
a_dmtest01=dmtest01:container01:2489
aip_dmtest01=dmtest01:192.168.33.101:2489

# container011.dmtest01;
# docbroker only definition;
d_dmtest011=container011:5489
dip_dmtest011=192.168.33.104:1489
# full definition;
f_dmtest011=dmtest01@container011:2489
fip_dmtest011=dmtest01@192.168.33.104:1489
# alternate ':' separator;
a_dmtest011=dmtest01:container011:2489
aip_dmtest011=dmtest01:192.168.33.104:2489

Lines 5 to 14 explain all the supported target syntaxes, with a new one presented on lines 12 to 14 which will be explained later, in the paragraph entitled Possible Enhancements.
Using lookup variables in a configuration file makes things easier when the host names are hard to remember, because better mnemonic aliases can be defined for them. Also, as they are looked up, the entries can be in any order. They must obviously be unique or they will mask each other. A consistent naming convention may be required to easily find one's way in this file.
Whenever the enhanced syntax is used, it triggers an automatic editing of the dfc.properties file and the specified connection information is inserted as dfc.docbroker.host and dfc.docbroker.port entries. Then, the corresponding Documentum tool gets invoked and finally the original dfc.properties file is restored when the tool exits. The trigger here is the presence of the @ or : characters in the first command-line parameter.
This would also cover the case where an entry is simply missing from the dfc.properties file. Actually, from the point of view of the command-line tools, all the connection definitions could be handed over to the new configuration file and even removed from dfc.properties, as they are dynamically added to and deleted from the latter file as needed.
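
To make the intended behavior concrete, here is a hypothetical before/after of the automatic edit performed for ./widql dmtest01@container03:3489 without --append:

# before:
dfc.docbroker.host[0]=docker
dfc.docbroker.port[0]=1489
# after: the existing entries are removed and the requested docbroker becomes entry 0;
dfc.docbroker.host[0]=container03
dfc.docbroker.port[0]=3489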

The Implementation

The above proposal looks pretty easy and fun to implement, so let’s give it a shot. In this article, I’ll present a little script, dctm_wrapper, that builds upon the above @syntax to first edit the configuration file on demand (that’s the dynamic part of the article’s title) and then invoke the standard idql, iapi or dmawk utilities, with an optional rollback of the change on exiting.
Since it is not possible to bypass the dfc.properties file, we will dynamically modify it whenever the @host syntax is used from a command-line tool. As we do not want to replace the official idql, iapi and dmawk tools (yet), we will create new ones, say widql, wiapi and wdmawk (where w stands for wrapper). Those will be symlinks to the real script, dctm-wrapper.sh, which will invoke either idql, iapi or dmawk according to how it was called (bash's $0 contains the name of the symlink that was invoked, even though its target is always dctm-wrapper.sh; see the script's source in the next paragraph).
The script dctm-wrapper.sh will support the following syntax:

$ ./widql docbase[@[host][:port]] [other standard parameters] [--verbose] [--append] [--keep]
$ ./wiapi docbase[@[host][:port]] [other standard parameters] [--verbose] [--append] [--keep]
$ ./wdmawk [-v] docbase[@[host][:port]] [dmawk parameters] [--verbose] [--append] [--keep]

The custom parameters --verbose, --append and --keep are processed by the script and stripped off before invoking the official tools.
wdmawk is a bit special in that the native tool, dmawk, is invoked differently from iapi/idql but I felt that it too could benefit from this little hack. Therefore, in addition to the non-interactive editing of the dfc.properties file, wdmawk also passes on the target docbase name as a -v docbase=… command-line parameter (the standard way to pass parameters in awk) and removes the extended target parameter docbase[@[host][:port]] unless it is prefixed by the -v option in which case it gets forwarded through the -v repo_target= parameter. The dmawk program is then free to use them the way it likes. The repo_target parameter could have been specified on the command-line independently but the -v option can still be useful in cases such as the one below:

$ ./wdmawk docbase@docker:1489 -v repo_target=docbase@docker:1489 '{....}'

which can be shortened to

$ ./wdmawk -v docbase@docker:1489 '{....}'

If the extended target docbase parameter is present, it must be the first one.
If the '@' or ':' characters are missing, it means the enhanced syntax is not used and the script will not attempt to modify dfc.properties; it will pass all the remaining parameters on to the matching official tool.
When @[host][:port] is present, the dfc.properties file will be edited to accommodate the new docbroker's parameters; all the existing dfc.docbroker.host/dfc.docbroker.port couples will either be removed (if --append is missing) or preserved (if --append is present) and a new couple will be appended with the given values. Obviously, if one wants to avoid the homonym trap, --append should not be used, so that the given docbroker is picked up as the sole entry in the property file.
When --append and --keep are present, we end up with a convenient way to add docbroker entries into the property file without manually editing it.
As the host is optional, it can be omitted and the one from the first dfc.docbroker.host[] entry will be used instead. Ditto for the port.
Normally, upon returning from the invocation of the original tool, the former dfc.properties file is restored to its original content. However, if --keep is mentioned, the rollback will not be performed and the modified file will replace the original one. The latter will still be there though, renamed to $DOCUMENTUM_SHARED/config/dfc.properties_saved_YY-MM-DD_HH:MI:SS, so it will still be possible to manually roll back. --keep is mostly useful in conjunction with --append so that new docbrokers get permanently added to the configuration file.
Finally, when --verbose is specified, the changes to the dfc.properties file are sent to stdout; a diff of the original and the new configuration files is also shown, along with the final command line used to invoke the selected original tool. This helps troubleshooting possible command-line parsing issues because, as can be seen from the code, no extra effort has been put into this section.

The Code

The script below shows a possible implementation:

#!/bin/bash
# Installation:
# it should not be called directly but through one of the aliases below for the standard tools instead:
# ln -s dctm-wrapper wiapi
# ln -s dctm-wrapper widql
# ln -s dctm-wrapper wdmawk
# where the initial w stands for wrapper;
# and then:
#    ./widql ...
# $DOCUMENTUM_SHARED must obviously exist;
# Since there is no \$DOCUMENTUM_SHARED in eCS ≥ 16.4, set it to $DOCUMENTUM as follows:
#    export DOCUMENTUM_SHARED=$DOCUMENTUM
# See Usage() for details;

Usage() {
   cat - <<EoU
./widql docbase[@[host][:port]] [other standard parameters] [--verbose] [--append] [--keep]
./wiapi docbase[@[host][:port]] [other standard parameters] [--verbose] [--append] [--keep]
./wdmawk [-v] docbase[@[host][:port]] [dmawk -v parameters] [--verbose] [--append] [--keep]
E.g.:
   wiapi dmtest
or:
   widql dmtest@remote_host
or:
   widql dmtest@remote_host:1491 -Udmadmin -Pxxxx
or:
   wiapi dmtest@:1491 --append
or:
   wdmawk -v dmtest01@docker:5489 -f ./twdmawk.awk -v ...
or:
   wdmawk dmtest01@docker:2489 -f ./twdmawk.awk -v ...
or:
   wiapi dmtest@remote_host:1491 --append --keep
etc...
If --verbose is present, the changes applied to \$DOCUMENTUM[_SHARED]/config/dfc.properties are displayed.
If --append is present, a new entry is appended to the dfc.properties file, the value couple dfc.docbroker.host and dfc.docbroker.port, and the existing ones are not commented out so they are still usable;
If --append is not present, all the entries are removed prior to inserting the new one;
If --keep is present, the changed dfc.properties file is not reverted to the original one, i.e. the changes are made permanent;
If a change of configuration has been requested, the original config file is first saved with a timestamp appended and restored on return from the standard tools, unless --keep is present in which case
the backup file is also kept so it is still possible to manually revert to the original configuration;
wdmawk invokes dmawk passing it the -v docbase=$docbase command-line parameter;
In addition, if -v docbase[@[host][:port]] is used, -v repo_target=docbase[@[host][:port]] is also passed to dmawk;
Instead of a in-line target definition, environment variables can also be used, e.g.:
   widql dmtest@$dmtestVM ...
where $dmtestVM resolves to e.g. docker:1489
or even:
   widql $test01c ...
where $test01c resolves to e.g. dmtest01@container01:1489
As the environment variable is resolved by the shell before it invokes the program, make sure it has a definition, e.g. source a configuration file;
EoU
   exit 0
}

if [[ $# -eq 0 ]]; then
   Usage
fi

# save command;
current_cmd="$0 $*"

# which original program shall possibly be called ?
dctm_program=$(basename $0); dctm_program=${dctm_program:1}
if [[ $dctm_program == "dmawk" ]]; then
   bFordmawk=1 
else
   bFordmawk=0 
fi

# look for the --verbose, --append or --keep options;
# remove them from the command-line if found so they are not passed to the standard Documentum's tools;
# the goal is to clean up the command-line from the enhancements options so it can be passed to the official tools;
bVerbose=0
bAppend=0
bKeep=0
posTarget=1
passTarget2awk=0
while true; do
   index=-1
   bChanged=0
   for i in "$@"; do
      (( index += 1 ))
      if [[ "$i" == "--verbose" ]]; then
         bVerbose=1
         bChanged=1
         break
      elif [[ "$i" == "--append" ]]; then
         bAppend=1
         bChanged=1
         break
      elif [[ "$i" == "--keep" ]]; then
         bKeep=1
         bChanged=1
         break
      elif [[ "$i" == "-v" && $bFordmawk -eq 1 && $index -eq 0 ]]; then
	 passTarget2awk=1
         bChanged=1
         break
      fi
   done
   if [[ $bChanged -eq 1 ]]; then
      set -- ${@:1:index} ${@:index+2:$#-index-1}
   else
      break
   fi
done

[[ bVerbose -eq 1 ]] && echo "current_cmd=[$current_cmd]"

target=$1
remote_info=$(echo $1 | gawk '{
   docbase = ""; hostname = ""; port = ""
   if (match($0, /@[^ \t:]*/)) {
      docbase = substr($0, 1, RSTART - 1)
      hostname = substr($0, RSTART + 1, RLENGTH - 1)
      rest = substr($0, RSTART + RLENGTH)
      if (1 == match(rest, /:[0-9]+/))
         port = substr(rest, 2, RLENGTH - 1)
   }
   else docbase = $0
}
END {
   printf("%s:%s:%s", docbase, hostname, port)
}')
docbase=$(echo $remote_info | cut -d: -f1)
hostname=$(echo $remote_info | cut -d: -f2)
port=$(echo $remote_info | cut -d: -f3)

# any modifications to the config file requested ?
if [[ ! -z $hostname || ! -z $port ]]; then
   # the dfc.properties file must be changed for the new target repository;
   dfc_config=$DOCUMENTUM_SHARED/config/dfc.properties
   if [[ ! -f $dfc_config ]]; then
      echo "$dfc_config not found"
      echo "check the \$DOCUMENTUM_SHARED environment variable"
      echo " in ≥ 16.4, set it to \$DOCUMENTUM"
      exit 1
   fi
   
   # save the current config file;
   backup_file=${dfc_config}_saved_$(date +"%Y-%m-%d_%H:%M:%S")
   cp $dfc_config ${backup_file}

   [[ $bVerbose -eq 1 ]] && echo "changing to $hostname:$port..."
   pid=$$; gawk -v hostname="$hostname" -v port="$port" -v bAppend=$bAppend -v bVerbose=$bVerbose -v bKeep=$bKeep -v pid=$$ 'BEGIN {
      bFirst_hostname = 0; first_hostname = ""
      bFirst_port     = 0 ;    first_port = ""
      max_index = -1
   }
   {
      if (match($0, /^dfc.docbroker.host\[[0-9]+\]=/)) {
         if (!hostname && !bFirst_hostname) {
            # save the first host name to be used if command-line hostname was omitted;
            bFirst_hostname = 1
            first_hostname = substr($0, RLENGTH +1)
         }
         match($0, /\[[0-9]+\]/); index_number = substr($0, RSTART + 1, RLENGTH - 2)
         if (bAppend) {
            # leave the entry;
            print $0
            if (index_number > max_index)
               max_index = index_number
         }
         else {
            # do not, which will remove the entry;
            if (bVerbose)
               print "# removed:", $0 > ("/tmp/tmp_" pid)
         }
      }
      else if (match($0, /^dfc.docbroker.port\[[0-9]+\]=/)) {
         if (!port && !bFirst_port) {
            # save the first port to be used if command-line port was omitted;
            bFirst_port = 1
            first_port = substr($0, RLENGTH +1)
         }
         if (bAppend)
            # leave the entry;
            print $0
         else {
            # do nothing, which will remove the entry;
            if (bVerbose)
               print "# removed:", $0 > ("/tmp/tmp_" pid)
         }
      }
      else print
   }
   END {
      if (!hostname)
         hostname = first_hostname
      if (!port)
         port = first_port
      if (bAppend)
         index_number = max_index + 1
      else
         index_number = 0
      print "dfc.docbroker.host[" index_number "]=" hostname
      print "dfc.docbroker.port[" index_number "]=" port
      if (bVerbose) {
         print "# added: dfc.docbroker.host[" index_number "]=" hostname > ("/tmp/tmp_" pid)
         print "# added: dfc.docbroker.port[" index_number "]=" port     > ("/tmp/tmp_" pid)
      }
      close("/tmp/tmp_" pid)
   }' $dfc_config > ${dfc_config}_new

   if [[ $bVerbose -eq 1 ]]; then
      echo "requested changes:"
      cat /tmp/tmp_$$
      rm /tmp/tmp_$$
      echo "diffs:"
      diff $dfc_config ${dfc_config}_new
   fi 

   mv ${dfc_config}_new $dfc_config
   shift

   if [[ $bFordmawk -eq 1 ]]; then
      docbase="-v docbase=$docbase"
      [[ $passTarget2awk -eq 1 ]] && docbase="-v repo_target=$target $docbase"
   fi
   [[ $bVerbose -eq 1 ]] && echo "calling original: $DM_HOME/bin/${dctm_program} $docbase $*"
   $DM_HOME/bin/${dctm_program} $docbase $*

   # restore original config file;
   [[ $bKeep -eq 0 ]] && mv ${backup_file} $dfc_config
else
   if [[ $bVerbose -eq 1 ]]; then
      echo "no change to current $dfc_config file"
      echo "calling original: $DM_HOME/bin/${dctm_program} $*"
   fi
   $DM_HOME/bin/${dctm_program} $*
fi

The original configuration file is always saved on entry by appending a timestamp precise to the second which, unless you're the Flash running the command twice in the background with the option --keep but without --append, should be enough to preserve the original content.
To make the command-line parsing simpler, the script relies on the finally invoked command to check any syntax errors. Feel free to modify it and make it more robust if you need that. As said earlier, the --verbose option can help troubleshooting unexpected results here.
See part II of this article for the tests.

The article Connecting to a Repository via a Dynamically Edited dfc.properties File (part I) appeared first on Blog dbi services.

Connecting to a Repository via a Dynamically Edited dfc.properties File (part II)


This is part II of a two-part article; see part I for the rationale and the dctm-wrapper script.

Testing

We will test on the host machine named docker, which hosts 2 containers, container01 and container011. All 3 machines run a repository, named respectively dmtest on docker (in short, dmtest@docker:1489), dmtest01@container01:1489 (dmtest01@container01:2489 externally) and dmtest01@container011:1489 (dmtest01@container011:5489 externally). Incidentally, the enhanced syntax is also a good way to uniquely identify the repositories.
The current dfc.properties file on the host docker:

$ grep docbroker /app/dctm/config/dfc.properties
dfc.docbroker.host[0]=docker
dfc.docbroker.port[0]=1489

This is used for the local docbase dmtest.
Let's tag all the docbases for easy identification later:

$ iapi dmtest -Udmadmin -Pdmadmin <<eoq
retrieve,c,dm_docbase_config
set,c,l,title
dmtest on docker host VM
save,c,l
eoq

Idem from within container01 with its default dfc.properties file:

$ iapi dmtest01 -Udmadmin -Pdmadmin <<eoq
retrieve,c,dm_docbase_config
set,c,l,title
dmtest01 created silently on container01
save,c,l
eoq

Idem from within container011:

$ iapi dmtest01 -Udmadmin -Pdmadmin <<eoq
retrieve,c,dm_docbase_config
set,c,l,title
dmtest01 created silently on container011
save,c,l
eoq

First, let's access container01.dmtest01 from the containers' host VM with the current dfc.properties file:

$ idql dmtest01 -Udmadmin -Pdmadmin
 
 
OpenText Documentum idql - Interactive document query interface
Copyright (c) 2018. OpenText Corporation
All rights reserved.
Client Library Release 16.4.0070.0035
 
 
Connecting to Server using docbase dmtest01
Could not connect
[DM_DOCBROKER_E_NO_SERVERS_FOR_DOCBASE]error: "The DocBroker running on host (docker:1489) does not know of a server for the specified docbase (dmtest01)"

As expected, it does not work because container01.dmtest01 does not project to the host’s docbroker. Now, let’s turn to widql:

$ ./widql dmtest01@docker:2489 -Udmadmin -Pdmadmin --keep <<eoq
select title from dm_docbase_config
go
eoq
OpenText Documentum idql - Interactive document query interface
Copyright (c) 2018. OpenText Corporation
All rights reserved.
Client Library Release 16.4.0070.0035
 
 
Connecting to Server using docbase dmtest01
[DM_SESSION_I_SESSION_START]info: "Session 0100c350800011bb started for user dmadmin."
 
 
Connected to OpenText Documentum Server running Release 16.4.0000.0248 Linux64.Oracle
title
------------------------------------------
dmtest01 created silently on container01

It works.
We used --keep, therefore the dfc.properties file has changed:

$ grep docbroker /app/dctm/config/dfc.properties
dfc.docbroker.host[0]=docker
dfc.docbroker.port[0]=2489

Indeed.
That docbase can also be reached by the container’s IP address and internal port 1489:

$ docker exec -it container01 ifconfig eth0 | head -3
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
inet 192.168.33.101 netmask 255.255.255.0 broadcast 192.168.33.255
ether 02:42:c0:a8:21:65 txqueuelen 0 (Ethernet)
 
$ ./widql dmtest01@192.168.33.101:1489 -Udmadmin -Pdmadmin <<eoq
select title from dm_docbase_config
go
eoq
...
Connecting to Server using docbase dmtest01
[DM_SESSION_I_SESSION_START]info: "Session 0100c350800011b5 started for user dmadmin."
...
title
------------------------------------------
dmtest01 created silently on container01

Is the local dmtest docbase still reachable?

$ idql dmtest -Udmadmin -Pdmadmin
...
Could not connect
[DM_DOCBROKER_E_NO_SERVERS_FOR_DOCBASE]error: "The DocBroker running on host (docker:2489) does not know of a server for the specified docbase (dmtest)"

Not with that changed dfc.properties file and the standard tools. But by using our nifty little tool:

$ ./widql dmtest@docker:1489 -Udmadmin -Pdmadmin <<eoq
select title from dm_docbase_config
go
eoq
...
Connected to OpenText Documentum Server running Release 16.4.0080.0129 Linux64.Oracle
title
----------------------
dmtest on host VM

Fine !
Is container011.dmtest01 reachable now ?

$ ./widql dmtest01 -Udmadmin -Pdmadmin <<eoq
select title from dm_docbase_config
go
eoq
...
Connecting to Server using docbase dmtest01
...
Connected to OpenText Documentum Server running Release 16.4.0000.0248 Linux64.Oracle
title
-------------------------------------------
dmtest01 created silently on container01

This is container01.dmtest01, not the one we want, i.e. the one on container011.
Note that ./widql was called without the extended syntax so it invoked the standard idql directly.
Let's try again:

$ ./widql dmtest01@docker:5489 -Udmadmin -Pdmadmin <<eoq
select title from dm_docbase_config
go
eoq
...
Connecting to Server using docbase dmtest01
[DM_SESSION_I_SESSION_START]info: "Session 0100c3508000059e started for user dmadmin."
...
title
------------------------------------------
dmtest01 created silently on container011

Here we go, it works !
The same using the container’s IP address and its docbroker’s internal port:

$ docker exec -it container011 ifconfig eth0 | head -3
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
inet 192.168.33.104 netmask 255.255.255.0 broadcast 192.168.33.255
ether 02:42:c0:a8:21:68 txqueuelen 0 (Ethernet)
 
$ ./widql dmtest01@192.168.33.104:5489 -Udmadmin -Pdmadmin <<eoq
select title from dm_docbase_config
go
eoq
...
Connecting to Server using docbase dmtest01
[DM_SESSION_I_SESSION_START]info: "Session 0100c35080000598 started for user dmadmin."
...
title
------------------------------------------
dmtest01 created silently on container011

Try now the same connection but with --append and --keep:

$ ./widql dmtest01@docker:5489 -Udmadmin -Pdmadmin --append --keep <<eoq
select title from dm_docbase_config
go
eoq
...
Connecting to Server using docbase dmtest01
...
Connected to OpenText Documentum Server running Release 16.4.0000.0248 Linux64.Oracle
title
-------------------------------------------
dmtest01 created silently on container011

What is the content of dfc.properties now ?

$ grep docbroker /app/dctm/config/dfc.properties
dfc.docbroker.host[0]=docker
dfc.docbroker.port[0]=2489
dfc.docbroker.host[1]=docker
dfc.docbroker.port[1]=5489

Both options have been taken into account as expected.
Let’s try to reach the VM host’s repository:

$ ./widql dmtest -Udmadmin -Pdmadmin <<eoq
select title from dm_docbase_config
go
eoq
...
Connecting to Server using docbase dmtest
Could not connect
[DM_DOCBROKER_E_NO_SERVERS_FOR_DOCBASE]error: "The DocBroker running on host (docker:2489) does not know of a server for the specified docbase (dmtest)"

Specify the docbroker's host and the --verbose option:

$ ./widql dmtest@docker -Udmadmin -Pdmadmin --verbose <<eoq
select title from dm_docbase_config
go
eoq
 
changing to docker:...
requested changes:
# removed: dfc.docbroker.host[0]=docker
# removed: dfc.docbroker.port[0]=2489
# removed: dfc.docbroker.host[1]=docker
# removed: dfc.docbroker.port[1]=5489
# added: dfc.docbroker.host[0]=docker
# added: dfc.docbroker.port[0]=2489
diffs:
12,13d11
< dfc.docbroker.host[1]=docker
< dfc.docbroker.port[1]=5489
calling original: /app/dctm/product/16.4/bin/idql dmtest -Udmadmin -Pdmadmin
...
Connecting to Server using docbase dmtest
Could not connect
[DM_DOCBROKER_E_NO_SERVERS_FOR_DOCBASE]error: "The DocBroker running on host (docker:2489) does not know of a server for the specified docbase (dmtest)"

Since the port was not specified, the wrapper took the first port found in the dfc.properties to supply the missing value, i.e. 2489, which is incorrect as dmtest only projects to the docbroker on docker:1489.
Use an unambiguous command now:

$ ./widql dmtest@docker:1489 -Udmadmin -Pdmadmin --verbose <<eoq
select title from dm_docbase_config
go
eoq
 
changing to docker:1489...
requested changes:
# removed: dfc.docbroker.host[0]=docker
# removed: dfc.docbroker.port[0]=2489
# removed: dfc.docbroker.host[1]=docker
# removed: dfc.docbroker.port[1]=5489
# added: dfc.docbroker.host[0]=docker
# added: dfc.docbroker.port[0]=1489
diffs:
11,13c11
< dfc.docbroker.port[0]=2489
< dfc.docbroker.host[1]=docker
---
> dfc.docbroker.port[0]=1489
calling original: /app/dctm/product/16.4/bin/idql dmtest -Udmadmin -Pdmadmin
...
Connecting to Server using docbase dmtest
...
Connected to OpenText Documentum Server running Release 16.4.0080.0129 Linux64.Oracle
title
--------------------
dmtest on host VM

Looks OK.
Let’s try wdmawk now. But first, here is the test code twdmawk.awk:

$ cat twdmawk.awk 
BEGIN {
   print "repo_target=" repo_target, "docbase=" docbase
   session = dmAPIGet("connect," docbase ",dmadmin,dmadmin")
   print dmAPIGet("getmessage," session)
   dmAPIGet("retrieve," session ",dm_docbase_config")
   print dmAPIGet("get," session ",l,title")
   dmAPIExec("disconnect," session)
   exit(0)
}

Line 3 displays the two variables automatically passed to dmawk by the wrapper, repo_target and docbase.
The test script connects to the docbase which was silently passed as a command-line parameter by wdmawk through the -v option after it extracted it from the given target parameter docbase[@host[:port]], as illustrated below with the --verbose option.
Let’s see the invocation for the repository on the host VM:

$ ./wdmawk dmtest@docker:1489 -f ./twdmawk.awk --verbose
changing to docker:1489...
requested changes:
# removed: dfc.docbroker.host[0]=docker
# removed: dfc.docbroker.port[0]=2489
# removed: dfc.docbroker.host[1]=docker
# removed: dfc.docbroker.port[1]=5489
# added: dfc.docbroker.host[0]=docker
# added: dfc.docbroker.port[0]=1489
diffs:
11,13c11
< dfc.docbroker.port[0]=2489
< dfc.docbroker.host[1]=docker
---
> dfc.docbroker.port[0]=1489
calling original: /app/dctm/product/16.4/bin/dmawk -v docbase=dmtest -f ./twdmawk.awk
repo_target= docbase=dmtest
[DM_SESSION_I_SESSION_START]info: "Session 0100c3508000367b started for user dmadmin."
 
 
dmtest on host VM

Let's access container01's repository:

$ ./wdmawk dmtest01@docker:2489 -f ./twdmawk.awk
 
[DM_SESSION_I_SESSION_START]info: "Session 0100c35080001202 started for user dmadmin."
 
 
dmtest01 created silently on container01

A small typo in the port number and …

dmadmin@docker:~$ ./wdmawk dmtest01@docker:3489 -f ./twdmawk.awk
 
[DFC_DOCBROKER_REQUEST_FAILED] Request to Docbroker "docker:3489" failed
 
[DM_SESSION_E_RPC_ERROR]error: "Server communication failure"
 
java.net.ConnectException: Connection refused (Connection refused)

Note the stupid error message "… Connection refused …", very misleading when investigating a problem. It's just that there is nobody listening on that port.
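To quickly confirm that nothing is listening on a given port, a simple probe from the client machine does the job (this assumes nc is available on the host; its output wording varies between versions):

dmadmin@docker:~$ # wrong port: the connection should be refused since no docbroker listens there
dmadmin@docker:~$ nc -zv docker 3489
dmadmin@docker:~$ # container01's exposed docbroker port: the connection should succeed
dmadmin@docker:~$ nc -zv docker 2489
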
Let’s access the container011’s repository:

dmadmin@docker:~$ ./wdmawk dmtest01@docker:5489 -f ./twdmawk.awk
 
[DM_SESSION_I_SESSION_START]info: "Session 0100c350800005ef started for user dmadmin."
 
 
dmtest01 created silently on container011

Effect of the -v option:

dmadmin@docker:~$ ./wdmawk -v dmtest01@docker:5489 -f ./twdmawk.awk --verbose
...
calling original: /app/dctm/product/16.4/bin/dmawk -v repo_target=dmtest@docker:1489 -v docbase=dmtest -f ./twdmawk.awk
repo_target=dmtest@docker:1489 docbase=dmtest
[DM_SESSION_I_SESSION_START]info: "Session 0100c35080003684 started for user dmadmin."
 
 
dmtest on host VM

A repo_target parameter with the extended syntax has been passed to dmawk.
Let’s now quickly check the wrapper from within the containers.
Container01
The host’s docbase:

[dmadmin@container01 scripts]$ ./wiapi dmtest@docker:1489 -Udmadmin -Pdmadmin<<eoq
retrieve,c,dm_docbase_config
get,c,l,title
eoq
...
Connecting to Server using docbase dmtest
...
dmtest on host VM

The container011’s docbase:

[dmadmin@container01 scripts]$ ./wiapi dmtest01@container011:1489 -Udmadmin -Pdmadmin<<eoq
retrieve,c,dm_docbase_config
get,c,l,title
eoq
...
Connecting to Server using docbase dmtest01
...
dmtest01 created silently on container011
...

Container011
The host’s docbase:

dmadmin@container011 scripts]$ ./wiapi dmtest@docker:1489 -Udmadmin -Pdmadmin<<eoq
retrieve,c,dm_docbase_config
get,c,l,title
eoq
...
Connecting to Server using docbase dmtest
...
Connected to OpenText Documentum Server running Release 16.4.0080.0129 Linux64.Oracle
...
dmtest on host VM
...

The docbase on container01:

dmadmin@container011 scripts]$ ./wiapi dmtest01@container01:1489 -Udmadmin -Pdmadmin<<eoq
retrieve,c,dm_docbase_config
get,c,l,title
eoq
...
...
Connecting to Server using docbase dmtest01
...
dmtest01 created silently on container01
...

Let’s briefly test the usage of the sourced configuration file. Here is a snippet of the file shown earlier in this article:

# repository connection configuration file;
# must be sourced prior so the environment variables can be resolved;
# this is an enhancement over the dfc.properties file syntax used by the dctm_wrapper utility:
# docbroker.host[i]=...
# docbroker.port[i]=...
# it supports several syntaxes:
# docbroker only definition [[docbroker_host]:[port]];
#    usage: ./widql dmtest@$dmtest
# full definition docbase[@[docbroker_host]:[port]]
#    usage: ./widql $test
# alternate ':' separator docbase:[[docbroker_host]:[docbroker_port]];
#    usage: ./widql $dmtestVM
# alias literal;
#    usage: ./widql test
# in order to resolve alias literals, the wrapper will source the configuration file by itself;
...
# container011.dmtest01;
# docbroker only definition docbroker_host:port;
d_dmtest011=container011:5489
di_dmtest011=192.168.33.104:1489
# full definition;
f_dmtest011=dmtest01@container011:2489
fip_dmtest011=dmtest01@192.168.33.104:1489

With a good name convention, the variables can be easily remembered which saves a lot of typing too.
Note on lines 9 and 10 how the whole extended target name can be specified, including the repository name.
A few tests:

dmadmin@docker:~$ ./widql dmtest01@$d_dmtest011 -Udmadmin -Pdmadmin --verbose
current_cmd=[./widql dmtest01@container011:5489 -Udmadmin -Pdmadmin --verbose] ...
 
dmadmin@docker:~$ ./widql dmtest01@$dip_dmtest011 -Udmadmin -Pdmadmin --verbose
current_cmd=[./widql dmtest01@192.168.33.104:1489 -Udmadmin -Pdmadmin --verbose] ...
 
dmadmin@docker:~$ ./widql $f_dmtest011 -Udmadmin -Pdmadmin --verbose
current_cmd=[./widql dmtest01@container011:2489 -Udmadmin -Pdmadmin --verbose] ...
 
dmadmin@docker:~$ ./widql $fip_dmtest011 -Udmadmin -Pdmadmin --verbose
current_cmd=[./widql dmtest01@192.168.33.104:1489 -Udmadmin -Pdmadmin --verbose] ...

The variables have been expanded by the shell prior to entering the wrapper, no programming effort was needed here, which is always appreciated.

Possible Enhancements

As shown previously, the alternate configuration file lists aliases for the couples docbroker:port and even repository@docbroker:port. In passing, the wrapper also supports the version repository:docbroker:port.
Now, in order to better match Documentum syntax, is it possible to be even more transparent by removing dollar signs, colons and at-signs while still accessing the extended syntax ? E.g.:

$ ./widql dmtest -Udmadmin ....

Yes it is. The trick here is to first look up the alias in the configuration file, which incidentally becomes mandatory now, and re-execute the program with the alias resolved. As we are all lazy coders, we will not explicitly code the lookup but instead rely on the shell: the wrapper will source the file, resolve the target and re-execute itself.
If the alias has not been defined in the file, then the wrapper considers it as the name of a repository and falls back to the usual command-line tools.
A good thing is that no new format has to be introduced in the file as the target is still the name of an environment variable.
Since the changes are really minimal, let’s do it. Hereafter, the diff output showing the changes from the listing in part I:

> # this variable points to the target repositories alias file and defaults to repository_connections.aliases;
> REPO_ALIAS=${REPO_ALIAS:-~/repository_connections.aliases}
> 
107a111
> [[ bVerbose -eq 1 ]] && echo "current configuration file=[$REPO_ALIAS]"
225,227c229,241
<    if [[ $bVerbose -eq 1 ]]; then
<       echo "no change to current $dfc_config file"
---
>    [[ -f $REPO_ALIAS ]] && . $REPO_ALIAS
>    definition=${!1}
>    [[ $bVerbose -eq 1 ]] && echo "alias lookup in $REPO_ALIAS: $1 = $definition"
>    if [[ ! -z $definition ]]; then
>       new_cmd=${current_cmd/$1/$definition}
>       [[ $bVerbose -eq 1 ]] && echo "invoking $new_cmd"
>       exec $new_cmd
>    else
>       if [[ $bVerbose -eq 1 ]]; then
>          echo "no change to current $dfc_config file"
>          echo "calling original: $DM_HOME/bin/${dctm_program} $*"
>       fi
>       $DM_HOME/bin/${dctm_program} $*
229d242
<    $DM_HOME/bin/${dctm_program} $*

On line 9, the target configuration file pointed to by the REPO_ALIAS environment variable gets sourced if it exists. $REPO_ALIAS defaults to repository_connections.aliases but can be changed before calling the wrapper.
Note on line 10 how bash can dereference a variable containing the name of another variable to get the latter's value (indirect expansion), nice touch.
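For readers unfamiliar with indirect expansion, here is a trivial, standalone illustration of the ${!name} construct used above:

$ target=f_dmtest
$ f_dmtest=dmtest@docker:1489
$ # ${!target} expands to the value of the variable whose name is stored in target
$ echo ${!target}
dmtest@docker:1489
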
To apply the patch in-place, save the diffs above in diff-file and run the following command:

patch old-file < diff-file

Testing
For conciseness, the tests below only show how the target is resolved. The actual connection has already been tested abundantly earlier.

dmadmin@docker:~$ ./widql f_dmtest -Udmadmin -Pdmadmin --verbose
current_cmd=[./widql f_dmtest -Udmadmin -Pdmadmin --verbose] alias lookup in /home/dmadmin/repository_connections.aliases: f_dmtest = dmtest@docker:1489
invoking ./widql dmtest@docker:1489 -Udmadmin -Pdmadmin --verbose
current_cmd=[/home/dmadmin/widql dmtest@docker:1489 -Udmadmin -Pdmadmin --verbose] ...
dmadmin@docker:~$ ./widql fip_dmtest01 -Udmadmin -Pdmadmin --verbose
current_cmd=[./widql fip_dmtest01 -Udmadmin -Pdmadmin --verbose] alias lookup in /home/dmadmin/repository_connections.aliases: fip_dmtest01 = dmtest01@192.168.33.2:1489
invoking ./widql dmtest01@192.168.33.2:1489 -Udmadmin -Pdmadmin --verbose
current_cmd=[/home/dmadmin/widql dmtest01@192.168.33.2:1489 -Udmadmin -Pdmadmin --verbose] ...
dmadmin@docker:~$ ./widql fip_dmtest011 -Udmadmin -Pdmadmin --verbose
current_cmd=[./widql fip_dmtest011 -Udmadmin -Pdmadmin --verbose] alias lookup in /home/dmadmin/repository_connections.aliases: fip_dmtest011 = dmtest01@192.168.33.3:1489
invoking ./widql dmtest01@192.168.33.3:1489 -Udmadmin -Pdmadmin --verbose
current_cmd=[/home/dmadmin/widql dmtest01@192.168.33.3:1489 -Udmadmin -Pdmadmin --verbose]

Note how the targets are cleaner now, no curly little fancy shell characters in front.

Conclusion

As I was testing this little utility, I was surprised to realize how comfortable and natural its usage is. It actually feels better to add the docbroker's host and port than to stop at the docbase name, probably because it makes the intended repository absolutely unambiguous. The good thing is that it is almost invisible, except for its invocation, but even this can be smoothed out by using command aliases or renaming the symlinks.
When one has to work with identically named docbases or with clones existing in different environments, dctm-wrapper brings real relief. And it was quick and easy to put together too.
As it modifies an essential configuration file, it is mainly aimed at developers or administrators on their own machines, but then those constitute the targeted audience anyway.
As always, if you have any ideas for some utility that could benefit us all, please do not hesitate to suggest them in the comment section. Feedback is welcome too of course.

The article Connecting to a Repository via a Dynamically Edited dfc.properties File (part II) first appeared on Blog dbi services.

Documentum – D2+Pack Plugins not installed correctly


In a previous blog, I explained how D2 can be installed in silent mode. In this blog, I will talk about a possible issue that might happen when doing so: the D2+Pack Plugins aren't installed, even if you ask D2 to install them, and there is no message or error related to this issue. The first time I faced this issue was several years ago but I never blogged about it. I faced it again recently so I thought I would this time.

So first, let’s prepare the D2 and D2+Pack packages for the silent installation. I will take the D2_template.xml file from my previous blog as a starting point for the silent parameter file:

[dmadmin@cs_01 ~]$ cd $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$ ls *.zip *.tar.gz
-rw-r-----. 1 dmadmin dmadmin 491128907 Jun 16 08:12 D2_4.7.0_P25.zip
-rw-r-----. 1 dmadmin dmadmin  61035679 Jun 16 08:12 D2_pluspack_4.7.0.P25.zip
-rw-r-----. 1 dmadmin dmadmin 122461951 Jun 16 08:12 emc-dfs-sdk-7.3.tar.gz
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_4.7.0_P25.zip -d $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25.zip -d $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Dar-Install.zip -d $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Dar-Install.zip -d $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Dar-Install.zip -d $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/
[dmadmin@cs_01 D2-Install]$ tar -xzvf $DOCUMENTUM/D2-Install/emc-dfs-sdk-7.3.tar.gz -C $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ #See the previous blog for the content of the "/tmp/dctm_install/D2_template.xml" file
[dmadmin@cs_01 D2-Install]$ export d2_install_file=$DOCUMENTUM/D2-Install/D2.xml
[dmadmin@cs_01 D2-Install]$ cp /tmp/dctm_install/D2_template.xml ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###WAR_REQUIRED###,true," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$ sed -i "s,###BPM_REQUIRED###,true," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$ sed -i "s,###DAR_REQUIRED###,true," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###DOCUMENTUM###,$DOCUMENTUM," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###PLUGIN_LIST###,$DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar;$DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar;$DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar;," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###JMS_HOME###,$DOCUMENTUM_SHARED/wildfly9.0.1," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###DFS_SDK_PACKAGE###,emc-dfs-sdk-7.3," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ read -s -p "  ----> Please enter the Install Owner's password: " dm_pw; echo; echo
  ----> Please enter the Install Owner's password: <TYPE HERE THE PASSWORD>
[dmadmin@cs_01 D2-Install]$ sed -i "s,###INSTALL_OWNER###,dmadmin," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$ sed -i "s,###INSTALL_OWNER_PASSWD###,${dm_pw}," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s/###DOCBASE_LIST###/Docbase1/" ${d2_install_file}
[dmadmin@cs_01 D2-Install]$

 

Now that the silent file is ready and all source packages are available, we can start the D2 installation with the command below. Please note the usage of the tracing/debugging options as well as of the "-Djava.io.tmpdir" Java option to ask D2 to put all tmp files in a specific directory; with this, D2 is supposed to trace/debug everything and use my specific temporary folder:

[dmadmin@cs_01 D2-Install]$ java -DTRACE=true -DDEBUG=true -Djava.io.tmpdir=$DOCUMENTUM/D2-Install/tmp -jar $DOCUMENTUM/D2-Install/D2_4.7.0_P25/D2-Installer-4.7.0.jar ${d2_install_file}

 

The D2 Installer printed the following extract:

...
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Installing plugin: $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/plugin/D2-Widget-Install.jar
...
...
Current line: #################################
Current line: #           Plugins               #
Current line: #################################
Current line: #plugin_1=../C2/C2-Plugin.jar
Updating line with 'plugin_'.
Updating plugin 1 with plugin name: D2-Widget-Plugin.jar and config exclude value of: false
Updating plugin 2 with plugin name: D2-Specifications-Plugin.jar and config exclude value of: false
Current line: #plugin_2=../O2/O2-Plugin.jar
Current line: #plugin_3=../P2/P2-Plugin.jar
...

 

As you can see, there are no errors so if you aren’t paying attention, you might think that the D2+Pack is properly installed. It’s not. At the end of the extract I put above, you can see that the D2 Installer is updating the plugins list with some elements (D2-Widget-Plugin.jar & D2-Specifications-Plugin.jar). If there were no issue, the D2+Pack Plugins would have been added in this section as well, which isn’t the case.

You can check all temporary files and all log files: it will not be printed anywhere that there was an issue while installing the D2+Pack Plugins. In fact, there are 3 things missing:

  • The DARs of the D2+Pack Plugins weren’t installed
  • The libraries of the D2+Pack Plugins weren’t deployed into the JMS
  • The libraries of the D2+Pack Plugins weren’t packaged in the WAR files

There is a way to quickly check whether the D2+Pack Plugins DARs have been installed: just look inside the docbase config folder, where there should be one log file for the D2 Core DARs as well as one log file for each of the D2+Pack Plugins. So that's what you should get:

[dmadmin@cs_01 D2-Install]$ cd $DOCUMENTUM/dba/config/Docbase1/
[dmadmin@cs_01 Docbase1]$ ls -ltr *.log
-rw-r-----. 1 dmadmin dmadmin  62787 Jun 16 08:18 D2_CORE_DAR.log
-rw-r-----. 1 dmadmin dmadmin   4794 Jun 16 08:20 D2-C2_dar.log
-rw-r-----. 1 dmadmin dmadmin   3105 Jun 16 08:22 D2-Bin_dar.log
-rw-r-----. 1 dmadmin dmadmin   2262 Jun 16 08:24 D2-O2_DAR.log
[dmadmin@cs_01 Docbase1]$

 

If you only have “D2_CORE_DAR.log”, then you are potentially facing this issue. You could also check the “csDir” folder that you put in the D2 silent parameter file: if this folder doesn’t contain “O2-API.jar” or “C2-API.jar” or “D2-Bin-API.jar”, then you have the issue as well. Obviously, you could also check the list of installed DARs in the repository…
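For that last check, a quick DQL query does the job; this is just an example and it assumes the installed DARs are registered, as usual, as dmc_dar objects in the repository (it also reuses the ${dm_pw} variable set earlier):

[dmadmin@cs_01 Docbase1]$ idql Docbase1 -Udmadmin -P${dm_pw} <<eoq
select object_name, r_creation_date from dmc_dar order by object_name
go
eoq
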

So what’s the issue? Well, you remember above when I mentioned the “-Djava.io.tmpdir” Java option to specifically ask D2 to put all temporary files under a certain location? The D2 Installer, for the D2 part, is using this option without issue… But for the D2+Pack installation, there is actually a hardcoded path for the temporary files which is /tmp. Therefore, it will ignore this Java option and will try instead to execute the installation under /tmp.

This is the issue I faced a few times already and it's the one I wanted to talk about in this blog. For security reasons, you might have to deal from time to time with specific mount options on file systems. In this case, the "noexec" option was set on the /tmp mount point and therefore D2 wasn't able to execute commands under /tmp; instead of printing an error, it just silently bypassed the installation. I opened an SR with the Documentum Support (when it was still EMC) to see if it was possible to use the Java option and not /tmp but it looks like it's still not solved since I had the exact same issue with the D2 4.7 P25 which was released very recently.

Since there is apparently no way to specify which temporary folder should be used for the D2+Pack Plugins, you should either perform the installation manually (DAR installation + libraries in JMS & WAR files) or remove the "noexec" option on the file system for the duration of the installation:

[dmadmin@cs_01 Docbase1]$ mount | grep " /tmp"
/dev/mapper/VolGroup00-LogVol06 on /tmp type ext4 (rw,noexec,nosuid,nodev)
[dmadmin@cs_01 Docbase1]$
[dmadmin@cs_01 Docbase1]$ sudo mount -o remount,exec /tmp
[dmadmin@cs_01 Docbase1]$ mount | grep " /tmp"
/dev/mapper/VolGroup00-LogVol06 on /tmp type ext4 (rw,nosuid,nodev)
[dmadmin@cs_01 Docbase1]$
[dmadmin@cs_01 Docbase1]$ #Execute the D2 Installer here
[dmadmin@cs_01 Docbase1]$
[dmadmin@cs_01 Docbase1]$ sudo mount -o remount /tmp
[dmadmin@cs_01 Docbase1]$ mount | grep " /tmp"
/dev/mapper/VolGroup00-LogVol06 on /tmp type ext4 (rw,noexec,nosuid,nodev)

 

With the workaround in place, the D2 Installer should now print the following (same extract as above):

...
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Line read: Current MAC address : [ Starting to unpack ]
Line read: [ Processing package: core (1/2) ]
Line read: [ Processing package: DAR (2/2) ]
Line read: [ Unpacking finished ]
Line read: [ Writing the uninstaller data ... ]
Line read: [ Automated installation done ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Line read: Current MAC address : [ Starting to unpack ]
Line read: [ Processing package: core (1/2) ]
Line read: [ Processing package: DAR (2/2) ]
Line read: [ Unpacking finished ]
Line read: [ Writing the uninstaller data ... ]
Line read: [ Automated installation done ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Line read: Current MAC address : [ Starting to unpack ]
Line read: [ Processing package: core (1/2) ]
Line read: [ Processing package: DAR (2/2) ]
Line read: [ Unpacking finished ]
Line read: [ Writing the uninstaller data ... ]
Line read: [ Automated installation done ]
Installing plugin: $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/plugin/D2-Widget-Install.jar
...
...
Current line: #################################
Current line: #           Plugins               #
Current line: #################################
Current line: #plugin_1=../C2/C2-Plugin.jar
Updating line with 'plugin_'.
Updating plugin 1 with plugin name: D2-Widget-Plugin.jar and config exclude value of: false
Updating plugin 2 with plugin name: C2-Plugin.jar and config exclude value of: false
Updating plugin 3 with plugin name: O2-Plugin.jar and config exclude value of: false
Updating plugin 4 with plugin name: D2-Specifications-Plugin.jar and config exclude value of: false
Updating plugin 5 with plugin name: D2-Bin-Plugin.jar and config exclude value of: false
Current line: #plugin_2=../O2/O2-Plugin.jar
Current line: #plugin_3=../P2/P2-Plugin.jar
...

 

As you can see above, the output is quite different: it means that the D2+Pack Plugins have been installed.

 

The article Documentum – D2+Pack Plugins not installed correctly first appeared on Blog dbi services.

Alfresco – ActiveMQ basic setup


Apache ActiveMQ

ActiveMQ is an open source message broker from the Apache Software Foundation that implements the Java Message Service (JMS) and supports a lot of protocols. In Alfresco 5, ActiveMQ was introduced as a new, optional, component in the stack. At the beginning, it was only used for "side" features like Alfresco Analytics or Alfresco Media Management in the early Alfresco 5.0. In Alfresco 6.0, ActiveMQ was still used for Alfresco Media Management but also for the Alfresco Sync Service. It's only starting with Alfresco 6.1, released last February, that ActiveMQ became a required component, still used for the same things but now also for transformations.

The Alfresco documentation doesn't really describe how to install or configure ActiveMQ; it just explains how to connect Alfresco to it. Therefore, I thought I would write a small blog about how to do a basic installation of ActiveMQ for use with Alfresco.

Alfresco 6.1 supports ActiveMQ v5.15.6 so that's the one I will be using as an example for this blog.

First, let's start by defining some environment variables that will be used to know where to put the ActiveMQ binaries and data:

[alfresco@mq_n1 ~]$ echo "export ACTIVEMQ_HOME=/opt/activemq" >> ~/.profile
[alfresco@mq_n1 ~]$ echo "export ACTIVEMQ_DATA=\$ACTIVEMQ_HOME/data" >> ~/.profile
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ grep "ACTIVEMQ" ~/.profile
export ACTIVEMQ_HOME=/opt/activemq
export ACTIVEMQ_DATA=$ACTIVEMQ_HOME/data
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ source ~/.profile
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ echo $ACTIVEMQ_DATA
/opt/activemq/data
[alfresco@mq_n1 ~]$

 

I'm usually using symlinks for all the components so that I can keep a generic path in case of upgrades, aso… So, let's download the software and put everything where it belongs:

[alfresco@mq_n1 ~]$ activemq_version="5.15.6"
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ wget http://archive.apache.org/dist/activemq/${activemq_version}/apache-activemq-${activemq_version}-bin.tar.gz
--2019-07-25 16:55:23--  http://archive.apache.org/dist/activemq/5.15.6/apache-activemq-5.15.6-bin.tar.gz
Resolving archive.apache.org... 163.172.17.199
Connecting to archive.apache.org|163.172.17.199|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 58556801 (56M) [application/x-gzip]
Saving to: ‘apache-activemq-5.15.6-bin.tar.gz’

apache-activemq-5.15.6-bin.tar.gz     100%[=======================================================================>]  55.84M  1.62MB/s    in 35s

2019-07-25 16:55:58 (1.60 MB/s) - ‘apache-activemq-5.15.6-bin.tar.gz’ saved [58556801/58556801]

[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ tar -xzf apache-activemq-${activemq_version}-bin.tar.gz
[alfresco@mq_n1 ~]$ mkdir -p $ACTIVEMQ_HOME-${activemq_version}
[alfresco@mq_n1 ~]$ ln -s $ACTIVEMQ_HOME-${activemq_version} $ACTIVEMQ_HOME
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ ls -l $ACTIVEMQ_HOME/.. | grep -i activemq
lrwxr-xr-x   1 alfresco  alfresco        31 Jul 25 17:04 activemq -> /opt/activemq-5.15.6
drwxr-xr-x   2 alfresco  alfresco        64 Jul 25 17:03 activemq-5.15.6
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ rm -rf ./apache-activemq-${activemq_version}/data
[alfresco@mq_n1 ~]$ mkdir -p $ACTIVEMQ_DATA
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ mv apache-activemq-${activemq_version}/* $ACTIVEMQ_HOME/

 

Once that is done and before starting ActiveMQ for the first time, there are still some configurations to be done. It is technically possible to add a specific authentication for communications between Alfresco and ActiveMQ or to set up the communications in SSL for example. It depends on the usage you will have for ActiveMQ but, as a minimal configuration for use with Alfresco, I believe that the default users ("guest" to access the broker & "user" to access the web console) should at least be removed and the admin password changed:

[alfresco@mq_n1 ~]$ activemq_admin_pwd="Act1v3MQ_pwd"
[alfresco@mq_n1 ~]$ activemq_broker_name="`hostname -s`"
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Remove user "user" from the web console
[alfresco@mq_n1 ~]$ sed -i "/^user:[[:space:]]*.*/d" $ACTIVEMQ_HOME/conf/jetty-realm.properties
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Remove user "guest" from the broker
[alfresco@mq_n1 ~]$ sed -i "/^guest.*/d" $ACTIVEMQ_HOME/conf/credentials.properties
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Change admin password
[alfresco@mq_n1 ~]$ sed -i "s/^admin=.*/admin=${activemq_admin_pwd}\n/" $ACTIVEMQ_HOME/conf/users.properties
[alfresco@mq_n1 ~]$ sed -i "s/^admin.*/admin: ${activemq_admin_pwd}, admin/" $ACTIVEMQ_HOME/conf/jetty-realm.properties
[alfresco@mq_n1 ~]$ sed -i "s/^activemq.username=.*/activemq.username=admin/" $ACTIVEMQ_HOME/conf/credentials.properties
[alfresco@mq_n1 ~]$ sed -i "s/^activemq.password=.*/activemq.password=${activemq_admin_pwd}/" $ACTIVEMQ_HOME/conf/credentials.properties
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ grep -E "brokerName|storeUsage |tempUsage " $ACTIVEMQ_HOME/conf/activemq.xml
    <broker xmlns="http://activemq.apache.org/schema/core" brokerName="localhost" dataDirectory="${activemq.data}">
                <storeUsage limit="100 gb"/>
                <tempUsage limit="50 gb"/>
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Set broker name & allowed usage
[alfresco@mq_n1 ~]$ sed -i "s/brokerName=\"[^"]*\"/brokerName=\"${activemq_broker_name}\"/" $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$ sed -i 's,storeUsage limit="[^"]*",storeUsage limit="10 gb",' $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$ sed -i 's,tempUsage limit="[^"]*",tempUsage limit="5 gb",' $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ grep -E "brokerName|storeUsage |tempUsage " $ACTIVEMQ_HOME/conf/activemq.xml
    <broker xmlns="http://activemq.apache.org/schema/core" brokerName="mq_n1" dataDirectory="${activemq.data}">
                    <storeUsage limit="10 gb"/>
                    <tempUsage limit="5 gb"/>
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ chmod -R o-rwx $ACTIVEMQ_HOME
[alfresco@mq_n1 ~]$ chmod -R o-rwx $ACTIVEMQ_DATA

 

So above, I set a specific name for the broker; that's mainly useful to differentiate brokers if you expect to have several of them at some point. I also changed the default storeUsage and tempUsage, mainly to show how it's done, because these two parameters define the disk space limits that ActiveMQ will be able to use on the file system. I believe the defaults are way too much for ActiveMQ's usage in Alfresco, so I always reduce these or use a percentage as value (percentLimit).
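For reference, switching to a percentage of the available disk space instead of a fixed size could be done like this (the percentLimit values below are just examples to adapt to your environment):

[alfresco@mq_n1 ~]$ # example only: replace the fixed limits by percentages
[alfresco@mq_n1 ~]$ sed -i 's,storeUsage limit="[^"]*",storeUsage percentLimit="10",' $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$ sed -i 's,tempUsage limit="[^"]*",tempUsage percentLimit="5",' $ACTIVEMQ_HOME/conf/activemq.xml
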

With the default configuration, ActiveMQ uses “${activemq.data}” for the data directory which is actually using the “$ACTIVEMQ_DATA” environment variable, if present (otherwise it sets it as $ACTIVEMQ_HOME/data). That’s the reason why I set this environment variable, so it is possible to define a different data folder without having to change the default configuration. This data folder will mainly contain the logs of ActiveMQ, the PID file and the KahaDB for the persistence adapter.

Finally creating a service for ActiveMQ and starting it is pretty easy as well:

[alfresco@mq_n1 ~]$ cat > activemq.service << EOF
[Unit]
Description=ActiveMQ service

[Service]
Type=forking
ExecStart=###ACTIVEMQ_HOME###/bin/activemq start
ExecStop=###ACTIVEMQ_HOME###/bin/activemq stop
Restart=always
User=alfresco
WorkingDirectory=###ACTIVEMQ_DATA###
LimitNOFILE=8192:65536

[Install]
WantedBy=multi-user.target
EOF
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sed -i "s,###ACTIVEMQ_HOME###,${ACTIVEMQ_HOME}," activemq.service
[alfresco@mq_n1 ~]$ sed -i "s,###ACTIVEMQ_DATA###,${ACTIVEMQ_DATA}," activemq.service
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo cp activemq.service /etc/systemd/system/
[alfresco@mq_n1 ~]$ rm activemq.service
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo systemctl enable activemq.service
[alfresco@mq_n1 ~]$ sudo systemctl daemon-reload
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo systemctl start activemq.service

 

Once ActiveMQ is set up as you want, registering it in Alfresco is very easy:

[alfresco@alf_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### ActiveMQ
messaging.broker.url=failover:(tcp://mq_n1.domain:61616)?timeout=3000&randomize=false&daemon=false&dynamicManagement=false&trace=false
#messaging.username=
#messaging.password=
...
[alfresco@alf_n1 ~]$

 

As mentioned at the beginning of this blog, ActiveMQ supports a lot of protocols so you can use pretty much what you want: TCP, NIO, SSL, NIO SSL, Peer (2 Peer), UDP, Multicast, HTTP, HTTPS, aso… You can find all the details for that here.
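For example, assuming a second broker on a hypothetical host mq_n2, the failover transport simply lists both of them so Alfresco can switch to the surviving one:

messaging.broker.url=failover:(tcp://mq_n1.domain:61616,tcp://mq_n2.domain:61616)?timeout=3000&randomize=false&daemon=false&dynamicManagement=false&trace=false
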

To add authentication between Alfresco and ActiveMQ, you will need to enable the properties in the alfresco-global.properties (the two that I commented above) and define the appropriate authentication in the ActiveMQ broker configuration. There is an example in the Alfresco documentation.
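As a sketch of what it could look like (the user name and password below are examples only): on the Alfresco side, set the two properties that were commented out above and, on the ActiveMQ side, declare the same credentials, for instance with the simpleAuthenticationPlugin inside the <broker> element of activemq.xml:

### alfresco-global.properties
messaging.username=alfresco
messaging.password=My+Mq_P4ssw0rd

### activemq.xml (inside the <broker> element)
<plugins>
  <simpleAuthenticationPlugin>
    <users>
      <authenticationUser username="alfresco" password="My+Mq_P4ssw0rd" groups="users"/>
    </users>
  </simpleAuthenticationPlugin>
</plugins>
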

 

The article Alfresco – ActiveMQ basic setup first appeared on Blog dbi services.

Alfresco Clustering – Basis & Architectures


This blog will be the first of a series on Alfresco HA/Clustering topics. It's been too long since I last posted anything related to Alfresco, so I thought about writing a few blogs about my experience with setting up more or less complex HA/Clustering infrastructures. So, let's start this first part with an introduction to Alfresco HA/Clustering.

If you want to set up a HA/Cluster environment, you will first have to think about where you want to go exactly. Alfresco is composed of several components so "what do you want to achieve exactly?" would probably be the first question to ask.

Alfresco offers a lot of possibilities, you can more or less do whatever you want. That's really great, but it also means that you should plan what you want to do first. Do you just want a simple HA architecture for Share+Repository where you can live without Solr for a few minutes/hours (in case of issues), or do you absolutely want all components to be always available? Or maybe you want an HA architecture which is better suited for high throughput? Obviously, there might be some cost details that need to be taken into consideration, linked to the resources but also to the licenses: the Alfresco Clustering license itself but also the Index Engine license if you go for separated Solr Servers.

That's what you need to define first to avoid losing time changing configurations and adding more components into the picture later. Alternatively (and that's something I will try to cover as much as I can), it's also possible to set up an environment which will allow you to add more components (at least some of them…) as needed without having to change your HA/Clustering configuration, if you are doing it right from the start and if you don't change the architecture itself too much.

I mentioned earlier the components of Alfresco (Alfresco Content Services, not the company), these are the ones we are usually talking about:

  • *Front-end (Apache HTTPD, Nginx, …)
  • *ActiveMQ
  • Alfresco PDF Renderer
  • Database
  • File System
  • ImageMagick
  • Java
  • LibreOffice
  • *Share (Tomcat)
  • *Repository (Tomcat)
  • *Solr6 (Jetty)

 

In this series of blogs, I won't talk about the Alfresco PDF Renderer, ImageMagick & Java because these are just simple binaries/executables that need to be available from the Repository side. For LibreOffice, it's usually Alfresco that manages it directly (multi-processes, restart if crash, aso…). It wouldn't really make sense to talk about these in blogs related to Clustering. I will also disregard the Database and File System ones since they are usually out of my scope. The Database is usually installed & managed by my colleagues who are DBAs; they are much better at that than I am. That leaves us with all components with an asterisk (*). I will update this list with links to the different blogs.

Before jumping into the first component, which will be the subject of the next blog, I wanted to go through some possible architectures for Alfresco. There are a lot of schemas available on the internet but it's often the same architecture that is presented, so I thought I would take some time to represent, in my own way, what Alfresco's architecture could look like.

In the below schemas, I represented the main components: Front-end, Share, Repository, Solr, Database & File System (Data) as little boxes. As mentioned previously, I won’t talk about the Database & File System so I just represented them once to see the communications with these but what is behind their boxes can be anything (with HA/Clustering or not). The arrows represent the way communications are initiated: an arrow in a single direction “A->B” means that B is never initiating a communication with A. Boxes that are glued together represent all components installed on the same host (a physical server, a VM, a container or whatever).

 

Alfresco Architecture N°1: This is the simplest architecture for Alfresco. As you can see, it's not a HA/Clustering architecture but I decided to start small. I added a Front-end (even if it's not mandatory) because it's a best practice and I would not install Alfresco without it. Nothing specific to say on this architecture, it's just simple.

 

Alfresco Architecture N°2: The first thing to do if you have the simplest architecture in place (N°1) and you start seeing some resource contention is to split the components and, more specifically, to install Solr separately. This should really be the minimal architecture to use, whenever possible.

 

Alfresco Architecture N°3: This is the first HA/Clustering architecture. It starts small as you can see with just two nodes for each Front-end/Share/Repository stack with a Load Balancer to dispatch the load on each side for an Active/Active solution. The dotted grey lines represent the Clustering communications. In this architecture, there is therefore a Clustering for Share and another one for the Repository layer. The Front-end doesn't need Clustering since it just forwards the communications but the session itself is on the Tomcat (Share/Repository) side. There is only one Solr node and therefore both Repository boxes will communicate with the Solr node (through the Front-end or not). Between the Repository and Solr, there is one bidirectional arrow and another one unidirectional. That's because both Repository boxes will initiate searches but the Solr will do tracking to index new content with only one Repository: this isn't optimal.

 

Alfresco Architecture N°4: To solve this small issue with Solr tracking, we can add a second Load Balancer in between so that the Solr tracking can target any Repository node. The first bottleneck you will encounter in Alfresco is usually the Repository because a lot of things are happening in the background at that layer. Therefore, this architecture is usually the simplest HA/Clustering solution that you will want to set up.

 

Alfresco Architecture N°5: If you are facing some performance issues with Solr or if you want all components to be in HA, then you will have to duplicate Solr as well. Between the two Solr nodes, I put a Clustering link; that's in case you are using Solr Sharding. If you are using the default cores (alfresco and archive), then there is no communication between distinct Solr nodes. If you are using Solr Sharding and you want a HA architecture, then you will have the same shards on both Solr nodes and, in this case, there will be communications between the Solr nodes; it's not really Clustering so to speak, that's how Solr Sharding works, but I still used the same representation.

 

Alfresco Architecture N°6: As mentioned previously (for the N°4), the Repository is usually the bottleneck. To reduce the load on this layer, it is possible to do several things. The first possibility is to install another Repository and dedicate it to the Solr Tracking. As you can see above, the communications aren't bidirectional anymore but only unidirectional. Searches will come from the two Repository nodes that are in the Cluster and Solr Tracking will use the separated/dedicated Repository. This third Repository can then be set in read-only, the jobs and services can be disabled, the Clustering can be disabled as well (so it uses the same DB but it's not part of the Clustering communications because it doesn't have to), aso… I put this third Repository as a standalone box but obviously you can install it with one of the two Solr nodes.

 

Alfresco Architecture N°7: The next step can be to add another read-only Repository and put these two nodes side by side with the Solr nodes. This is to only have localhost communications for the Solr Tracking which is therefore a little bit easier to secure.

 

Alfresco Architecture N°8: The previous architectures (N°6 & N°7) introduced a new single point of failure so to fix this, there is only one way: add a new Load Balancer between the Solr and the Repository for the tracking. Behind the Load Balancer, there are two solutions: keep the fourth Repository, which is also in read-only, or use a fallback to the Repository node1/node2 in case the read-only Repository (node3) isn't available. For that purpose, the Load Balancer should be in, respectively, Active/Active or Active/Passive mode. As you can see, I chose to represent the first one.

 

These were a few possible architectures. You can obviously add more nodes if you want to, to handle more load. There are many other solutions so have fun designing the best one, according to your requirements.

 

The article Alfresco Clustering – Basis & Architectures first appeared on Blog dbi services.

Alfresco Clustering – Repository


In a previous blog, I talked about the basics and presented some possible architectures for Alfresco. Now that this introduction has been done, let's dig into the real blogs about how to set up a HA/Clustering Alfresco environment. In this blog in particular, I will talk about the Repository layer.

For the Repository Clustering, there are three prerequisites (and that’s all you need):

  • A valid license which include the Repository Clustering
  • A shared file system which is accessible from all Alfresco nodes in the Cluster. This is usually a NAS accessed via NFS
  • A shared database

 

Clustering the Repository part is really simple to do: you just need to put the correct properties in the alfresco-global.properties file. Of course, you could also manage it all from the Alfresco Admin Console but that's not recommended; you should really always use the alfresco-global.properties by default. The Alfresco Repository Clustering is using Hazelcast. It was using JGroups and EHCache as well before Alfresco 4.2 but now it's just Hazelcast. So to define an Alfresco Cluster, simply put the following configuration in the alfresco-global.properties of the Alfresco node1:

[alfresco@alf_n1 ~]$ getent hosts `hostname` | awk '{ print $1 }'
10.10.10.10
[alfresco@alf_n1 ~]$
[alfresco@alf_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### Content Store
dir.root=/shared_storage/alf_data
...
### DB
db.username=alfresco
db.password=My+P4ssw0rd
db.name=alfresco
db.host=db_vip
## MySQL
#db.port=3306
#db.driver=com.mysql.jdbc.Driver
#db.url=jdbc:mysql://${db.host}:${db.port}/${db.name}?useUnicode=yes&characterEncoding=UTF-8
#db.pool.validate.query=SELECT 1
## PostgreSQL
db.driver=org.postgresql.Driver
db.port=5432
db.url=jdbc:postgresql://${db.host}:${db.port}/${db.name}
db.pool.validate.query=SELECT 1
## Oracle
#db.driver=oracle.jdbc.OracleDriver
#db.port=1521
#db.url=jdbc:oracle:thin:@${db.host}:${db.port}:${db.name}
#db.pool.validate.query=SELECT 1 FROM DUAL
...
### Clustering
alfresco.cluster.enabled=true
alfresco.cluster.interface=10.10.10.10-11
alfresco.cluster.nodetype=Alfresco_node1
alfresco.hazelcast.password=Alfr3sc0_hz_Test_pwd
alfresco.hazelcast.port=5701
alfresco.hazelcast.autoinc.port=false
alfresco.hazelcast.max.no.heartbeat.seconds=15
...
[alfresco@alf_n1 ~]$

 

And for the Alfresco node2, you can use the same content:

[alfresco@alf_n2 ~]$ getent hosts `hostname` | awk '{ print $1 }'
10.10.10.11
[alfresco@alf_n2 ~]$
[alfresco@alf_n2 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### Content Store
dir.root=/shared_storage/alf_data
...
### DB
db.username=alfresco
db.password=My+P4ssw0rd
db.name=alfresco
db.host=db_vip
## MySQL
#db.port=3306
#db.driver=com.mysql.jdbc.Driver
#db.url=jdbc:mysql://${db.host}:${db.port}/${db.name}?useUnicode=yes&characterEncoding=UTF-8
#db.pool.validate.query=SELECT 1
## PostgreSQL
db.driver=org.postgresql.Driver
db.port=5432
db.url=jdbc:postgresql://${db.host}:${db.port}/${db.name}
db.pool.validate.query=SELECT 1
## Oracle
#db.driver=oracle.jdbc.OracleDriver
#db.port=1521
#db.url=jdbc:oracle:thin:@${db.host}:${db.port}:${db.name}
#db.pool.validate.query=SELECT 1 FROM DUAL
...
### Clustering
alfresco.cluster.enabled=true
alfresco.cluster.interface=10.10.10.10-11
alfresco.cluster.nodetype=Alfresco_node2
alfresco.hazelcast.password=Alfr3sc0_hz_Test_pwd
alfresco.hazelcast.port=5701
alfresco.hazelcast.autoinc.port=false
alfresco.hazelcast.max.no.heartbeat.seconds=15
...
[alfresco@alf_n2 ~]$

 

Description of the Clustering parameters:

  • alfresco.cluster.enabled: Whether or not you want to enable the Repository Clustering for the local Alfresco node. The default value is false. You will want to set that to true for all Repository nodes that will be used by Share or any other client. If the Repository is only used for Solr Tracking, you can leave that to false
  • alfresco.cluster.interface: This is the network interface on which Hazelcast will listen for Clustering messages. This has to be an IP, it can’t be a hostname. To keep things simple and to have the same alfresco-global.properties on all Alfresco nodes however, it is possible to use a specific nomenclature:
    • 10.10.10.10: Hazelcast will try to bind on 10.10.10.10 only. If it’s not available, then it won’t start
    • 10.10.10.10-11: Hazelcast will try to bind on any IP within the range 10-11 so in this case 2 IPs: 10.10.10.10 or 10.10.10.11. If you have, let’s say, 4 IPs assigned to the local host and you don’t want Hazelcast to use 2 of these, then specify the ones that it can use and it will pick one from the list. This can also be used to have the same content for the alfresco-global.properties on different hosts… One server with IP 10.10.10.10 and a second one with IP 10.10.10.11
    • 10.10.10.* or 10.10.*.*: Hazelcast will try to bind on any IP in this range, this is an extended version of the XX-YY range above
  • alfresco.cluster.nodetype: A human-friendly string to represent the local Alfresco node. It doesn’t have any use for Alfresco, that’s really more for you. It is for example interesting to put a specific string for Alfresco node that won’t take part in the Clustering but that are still using the same Content Store and Database (like a Repository dedicated for the Solr Tracking, as mentioned above)
  • alfresco.hazelcast.password: The password to use for the Alfresco Repository Cluster. You need to use the same password for all members of the same Cluster. You should as well try to use a different password for each Cluster that you might have if they are in the same network (DEV/TEST/PROD for example), otherwise it will get ugly
  • alfresco.hazelcast.port: The default port that will be used for Clustering messages between the different members of the Cluster
  • alfresco.hazelcast.autoinc.port: Whether or not you want to allow Hazelcast to find another free port in case the default port (“alfresco.hazelcast.port”) is currently used. It will increment the port by 1 each time. You should really set this to false and just use the default port, to have full control over the channels that Clustering communications are using otherwise it might get messy as well
  • alfresco.hazelcast.max.no.heartbeat.seconds: The maximum time in seconds allowed between two heartbeats. If there is no heartbeat in this period of time, Alfresco will assume the remote node isn't running/available

 

As you can see above, it's really simple to add Clustering to an Alfresco Repository. Since you can (should?) have the same set of properties (except the nodetype string maybe), it also really simplifies the deployment… If you are familiar with other Document Management Systems like Documentum for example, then you understand the complexity of some of these solutions! If you compare that to Alfresco, it's like walking on the street versus walking on the moon where you obviously first need to go to the moon… Anyway, once it's done, the logs of the Alfresco Repository node1 will display something like that when you start it:

2019-07-20 15:14:25,401  INFO  [cluster.core.ClusteringBootstrap] [localhost-startStop-1] Cluster started, name: MainRepository-<generated_id>
2019-07-20 15:14:25,405  INFO  [cluster.core.ClusteringBootstrap] [localhost-startStop-1] Current cluster members:
  10.10.10.10:5701 (hostname: alf_n1)

 

Wait for the Repository node1 to be fully started and, once done, you can start the Repository node2 (they normally need to be started sequentially). You will see in the logs of the Repository node1 that another node automatically joined the Cluster:

2019-07-20 15:15:06,528  INFO  [cluster.core.MembershipChangeLogger] [hz._hzInstance_1_MainRepository-<generated_id>.event-3] Member joined: 10.10.10.11:5701 (hostname: alf_n2)
2019-07-20 15:15:06,529  INFO  [cluster.core.MembershipChangeLogger] [hz._hzInstance_1_MainRepository-<generated_id>.event-3] Current cluster members:
  10.10.10.10:5701 (hostname: alf_n1)
  10.10.10.11:5701 (hostname: alf_n2)

 

On the logs of the Repository node2, you can see directly at the initialization of the Hazelcast Cluster that the two nodes are available.

If you don’t want to check the logs, you can see pretty much the same thing from the Alfresco Admin Console. By accessing “http(s)://<hostname>:<port>/alfresco/s/enterprise/admin/admin-clustering“, you can see currently available cluster members (online nodes), non-available cluster members (offline nodes) as well as connected non-cluster members (nodes using the same DB & Content Store but with “alfresco.cluster.enabled=false”, for example to dedicate a Repository to Solr Tracking).

Alfresco also provides a small utility to check the health of the cluster which will basically ensure that the communication between each member is successful. This utility can be accessed at “http(s)://<hostname>:<port>/alfresco/s/enterprise/admin/admin-clustering-test“. It is useful to include a quick check using this utility in a monitoring solution for example, to ensure that the cluster is healthy.
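For example, a monitoring probe could be as simple as a curl call against this utility with an admin account, grepping for the expected success message (URL, credentials and the string to match are to be adapted to your environment and version):

[alfresco@alf_n1 ~]$ curl -s -u admin:"<admin_pwd>" "http://localhost:8080/alfresco/s/enterprise/admin/admin-clustering-test" | grep -i "success"
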

 

Cet article Alfresco Clustering – Repository est apparu en premier sur Blog dbi services.


Alfresco Clustering – Share


In previous blogs, I talked about some basics and presented some possible architectures for Alfresco and I talked about the Clustering setup for the Alfresco Repository. In this one, I will work on the Alfresco Share layer. Therefore, if you are using another client like a CMIS/REST client or an ADF Application, it won’t work that way, but you might or might not need Clustering at that layer; it depends on how the Application is working.

The Alfresco Share Clustering is used only for the caches, so you could technically have multiple Share nodes working with a single Repository or a Repository Cluster without the Share Clustering. For that, you would have to disable the caches on the Share layer because, if you kept them enabled, you would eventually face issues. Alfresco introduced the Share Clustering to keep the caches in sync, so you don’t have to disable them anymore. When needed, cache invalidation messages are sent from one Share node to all others; that includes runtime application properties changes as well as new/existing site/user dashboards changes.

Just like for the Repository part, it’s really easy to set up the Share Clustering so there is really no reason not to do it. It’s also using Hazelcast but it’s not based on properties that you need to configure in the alfresco-global.properties (because it’s a Share configuration): this one must be done in an XML file and there is obviously no possibility to do that in the Alfresco Admin Console.

All Share configurations/customizations are put in the “$CATALINA_HOME/shared/classes/alfresco/web-extension” folder and this one is no exception. There are two possibilities for the Share Clustering communications:

  • Multicast
  • Unicast (TCP-IP in Hazelcast)

 

I. Multicast

If you do not know how many nodes will participate in your Share Cluster or if you want to be able to add more nodes in the future without having to change the previous nodes’ configuration, then you probably want to check and opt for the Multicast option. Just create a new file “$CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml” and put this content inside it:

[alfresco@share_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hz="http://www.hazelcast.com/schema/spring"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
                           http://www.hazelcast.com/schema/spring
                           http://www.hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd">

  <hz:topic id="topic" instance-ref="webframework.cluster.slingshot" name="share_hz_test"/>
  <hz:hazelcast id="webframework.cluster.slingshot">
    <hz:config>
      <hz:group name="slingshot" password="Sh4r3_hz_Test_pwd"/>
      <hz:network port="5801" port-auto-increment="false">
        <hz:join>
          <hz:multicast enabled="true" multicast-group="224.2.2.5" multicast-port="54327"/>
          <hz:tcp-ip enabled="false">
            <hz:members></hz:members>
          </hz:tcp-ip>
        </hz:join>
        <hz:interfaces enabled="false">
          <hz:interface></hz:interface>
        </hz:interfaces>
      </hz:network>
    </hz:config>
  </hz:hazelcast>

  <bean id="webframework.cluster.clusterservice" class="org.alfresco.web.site.ClusterTopicService" init-method="init">
    <property name="hazelcastInstance" ref="webframework.cluster.slingshot" />
    <property name="hazelcastTopicName">
      <value>share_hz_test</value>
    </property>
  </bean>

</beans>
[alfresco@share_n1 ~]$

 

In the above configuration, be sure to set a topic name (matching the hazelcastTopicName’s value) as well as a group password that is specific to this environment, so you don’t end up with a single Cluster with members coming from different environments. For the Share layer, it’s less of an issue than for the Repository layer but still. Be sure also to use a network port that isn’t in use; it will be the port that Hazelcast will bind itself to on the local host. For the Repository Clustering, we used 5701 so here it’s 5801 for example.

Not much more to say about this configuration, we just enabled the multicast with an IP and a port to be used and we disabled the tcp-ip one.

The interfaces section is disabled by default but you can enable it, if you want to. If it’s disabled, Hazelcast will list all local interfaces (127.0.0.1, local_IP1, local_IP2, …) and it will choose one in this list. If you want to force Hazelcast to use a specific local network interface, then enable this section and add the interface(s) here, as shown in the sketch after this list. It can use the following nomenclature (IP only!):

  • 10.10.10.10: Hazelcast will try to bind on 10.10.10.10 only. If it’s not available, then it won’t start
  • 10.10.10.10-11: Hazelcast will try to bind on any IP within the range 10-11 so in this case 2 IPs: 10.10.10.10 or 10.10.10.11. If you have, let’s say, 5 IPs assigned to the local host and you don’t want Hazelcast to use 3 of these, then specify the ones that it can use and it will pick one from the list. This can also be used to have the same content for the custom-slingshot-application-context.xml on different hosts… One server with IP 10.10.10.10 and a second one with IP 10.10.10.11
  • 10.10.10.* or 10.10.*.*: Hazelcast will try to bind on any IP in this range, this is an extended version of the XX-YY range above
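
For example, a minimal sketch of the same network section with the interfaces enabled could look like the following (the 10.10.10.* range is obviously an assumption to adapt to your environment):

      <hz:network port="5801" port-auto-increment="false">
        <hz:join>
          <hz:multicast enabled="true" multicast-group="224.2.2.5" multicast-port="54327"/>
          <hz:tcp-ip enabled="false">
            <hz:members></hz:members>
          </hz:tcp-ip>
        </hz:join>
        <!-- Force Hazelcast to bind on one of the local IPs matching this range -->
        <hz:interfaces enabled="true">
          <hz:interface>10.10.10.*</hz:interface>
        </hz:interfaces>
      </hz:network>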

 

For most cases, keeping the interfaces disabled is sufficient since it will just pick one available. You might think that Hazelcast may bind itself to 127.0.0.1, technically it’s possible since it’s a local network interface but I have never seen it do so, so I assume that there is some kind of preferred order if another IP is available.

Membership in Hazelcast is based on “age”, meaning that the oldest member will be the one to lead. There is no predefined Master or Slave members, they are all equal, but the oldest/first member is the one that will check if new members are allowed to join (correct config) and if so, it will send the information to all other members that joined already so they are all aligned. If multicast is enabled, a multicast listener is started to listen for new membership requests.

 

II. Unicast

If you already know how many nodes will participate in your Share Cluster or if you prefer to avoid Multicast messages (there is no real need to overload your network with such things…), then it’s preferable to use Unicast messaging. For that purpose, just create the same file as above (“$CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml“) but instead, use the tcp-ip section:

[alfresco@share_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hz="http://www.hazelcast.com/schema/spring"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
                           http://www.hazelcast.com/schema/spring
                           http://www.hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd">

  <hz:topic id="topic" instance-ref="webframework.cluster.slingshot" name="share_hz_test"/>
  <hz:hazelcast id="webframework.cluster.slingshot">
    <hz:config>
      <hz:group name="slingshot" password="Sh4r3_hz_Test_pwd"/>
      <hz:network port="5801" port-auto-increment="false">
        <hz:join>
          <hz:multicast enabled="false" multicast-group="224.2.2.5" multicast-port="54327"/>
          <hz:tcp-ip enabled="true">
            <hz:members>share_n1.domain,share_n2.domain</hz:members>
          </hz:tcp-ip>
        </hz:join>
        <hz:interfaces enabled="false">
          <hz:interface></hz:interface>
        </hz:interfaces>
      </hz:network>
    </hz:config>
  </hz:hazelcast>

  <bean id="webframework.cluster.clusterservice" class="org.alfresco.web.site.ClusterTopicService" init-method="init">
    <property name="hazelcastInstance" ref="webframework.cluster.slingshot" />
    <property name="hazelcastTopicName">
      <value>share_hz_test</value>
    </property>
  </bean>

</beans>
[alfresco@share_n1 ~]$

 

The description is basically the same as for the Multicast part. The main difference is that the multicast was disabled, the tcp-ip was enabled and there is therefore a list of members that needs to be set. This is a comma separated list of hostnames or IPs that Hazelcast will try to contact when it starts. Membership in case of Unicast is managed in the same way, except that the oldest/first member will listen for new membership requests over TCP-IP. Therefore, it’s the same principle, it’s just done differently.

Starting the first Share node in the Cluster will display the following information on the logs:

Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Resolving domain name 'share_n1.domain' to address(es): [127.0.0.1, 10.10.10.10]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Resolving domain name 'share_n2.domain' to address(es): [10.10.10.11]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [share_n1.domain/10.10.10.10, share_n2.domain/10.10.10.11, share_n1.domain/127.0.0.1]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Prefer IPv4 stack is true.
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Picked Address[share_n1.domain]:5801, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5801], bind any local is true
Jul 28, 2019 11:45:36 AM com.hazelcast.system
INFO: [share_n1.domain]:5801 [slingshot] Hazelcast Community Edition 2.4 (20121017) starting at Address[share_n1.domain]:5801
Jul 28, 2019 11:45:36 AM com.hazelcast.system
INFO: [share_n1.domain]:5801 [slingshot] Copyright (C) 2008-2012 Hazelcast.com
Jul 28, 2019 11:45:36 AM com.hazelcast.impl.LifecycleServiceImpl
INFO: [share_n1.domain]:5801 [slingshot] Address[share_n1.domain]:5801 is STARTING
Jul 28, 2019 11:45:36 AM com.hazelcast.impl.TcpIpJoiner
INFO: [share_n1.domain]:5801 [slingshot] Connecting to possible member: Address[share_n2.domain]:5801
Jul 28, 2019 11:45:36 AM com.hazelcast.nio.SocketConnector
INFO: [share_n1.domain]:5801 [slingshot] Could not connect to: share_n2.domain/10.10.10.11:5801. Reason: ConnectException[Connection refused]
Jul 28, 2019 11:45:37 AM com.hazelcast.nio.SocketConnector
INFO: [share_n1.domain]:5801 [slingshot] Could not connect to: share_n2.domain/10.10.10.11:5801. Reason: ConnectException[Connection refused]
Jul 28, 2019 11:45:37 AM com.hazelcast.impl.TcpIpJoiner
INFO: [share_n1.domain]:5801 [slingshot]

Members [1] {
        Member [share_n1.domain]:5801 this
}

Jul 28, 2019 11:45:37 AM com.hazelcast.impl.LifecycleServiceImpl
INFO: [share_n1.domain]:5801 [slingshot] Address[share_n1.domain]:5801 is STARTED
2019-07-28 11:45:37,164  INFO  [web.site.ClusterTopicService] [localhost-startStop-1] Init complete for Hazelcast cluster - listening on topic: share_hz_test

 

Then starting a second node of the Share Cluster will display the following (still on the node1 logs):

Jul 28, 2019 11:48:31 AM com.hazelcast.nio.SocketAcceptor
INFO: [share_n1.domain]:5801 [slingshot] 5801 is accepting socket connection from /10.10.10.11:34191
Jul 28, 2019 11:48:31 AM com.hazelcast.nio.ConnectionManager
INFO: [share_n1.domain]:5801 [slingshot] 5801 accepted socket connection from /10.10.10.11:34191
Jul 28, 2019 11:48:38 AM com.hazelcast.cluster.ClusterManager
INFO: [share_n1.domain]:5801 [slingshot]

Members [2] {
        Member [share_n1.domain]:5801 this
        Member [share_n2.domain]:5801
}

 

Cet article Alfresco Clustering – Share est apparu en premier sur Blog dbi services.

Alfresco Clustering – ActiveMQ


In previous blogs, I talked about some basics and presented some possible architectures for Alfresco, I talked about the Clustering setup for the Alfresco Repository and the Alfresco Share. In this one, I will work on the ActiveMQ layer. I recently posted something related to the setup of ActiveMQ and some initial configuration. I will therefore extend this topic in this blog with what needs to be done to have a simple Cluster for ActiveMQ. I’m not an ActiveMQ expert, I just started using it a few months ago in relation to Alfresco but still, I learned some things in this timeframe so this might be of some use.

ActiveMQ is a Messaging Server so there are three sides to this component. First, there are Producers which produce messages. These messages are put in the broker’s queue, which is the second side, and finally there are Consumers which consume the messages from the queue. Producers and Consumers are satellites that are using the JMS broker’s queue: they are both clients. Therefore, in a standalone architecture (one broker), there is no issue because clients will always produce and consume all messages. However, if you start adding more brokers and if you aren’t doing it right, you might have producers talking to a specific broker and consumers talking to another one. To solve that, there are a few possibilities:

  • a first solution is to create a Network of Brokers which will allow the different brokers to forward the necessary messages between them. You can see that as an Active/Active Cluster
    • Pros: this allows ActiveMQ to support a huge architecture with potentially hundreds or thousands of brokers
    • Cons: messages are, at any point in time, only owned by one single broker so if this broker goes down, the message is lost (if there is no persistence) or will have to wait for the broker to be restarted (if there is persistence)
  • the second solution that ActiveMQ supports is the Master/Slave one. In this architecture, all messages will be replicated from a Master to all Slave brokers. You can see that as something like an Active/Passive Cluster
    • Pros: messages are always processed and cannot be lost. If the Master broker goes down for any reason, one of the Slaves instantly takes its place as the new Master with all the previous messages
    • Cons: since all messages are replicated, it’s much harder to support a huge architecture

In case of a Network of Brokers, it’s possible to use either the static or dynamic discovery of brokers:

  • Static discovery: Uses the static protocol to provide a list of all URIs to be tested to discover other connections. E.g.: static:(tcp://mq_n1.domain:61616,tcp://mq_n2.domain:61616)?maxReconnectDelay=3000
  • Dynamic discovery: Uses a multicast discovery agent to check for other connections. This is done using the discoveryUri parameter in the XML configuration file
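
To illustrate the first option, here is a minimal sketch of what the static discovery could look like in the activemq.xml of one broker that is part of a Network of Brokers (the broker names and URIs are assumptions and this is not needed for the Master/Slave setup described later in this blog):

  <broker xmlns="http://activemq.apache.org/schema/core" brokerName="mq_n1" dataDirectory="${activemq.data}">
    ...
    <networkConnectors>
      <!-- Forward the necessary messages to the other broker(s) of the Network of Brokers -->
      <networkConnector name="nob" uri="static:(tcp://mq_n2.domain:61616)?maxReconnectDelay=3000" duplex="true"/>
    </networkConnectors>
    ...
  </broker>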

 

I. Client’s configuration

On the client’s side, using several brokers is very simple since it’s all about using the correct broker URL. To be able to connect to several brokers, you should use the Failover Transport protocol which replaced the Reliable protocol used in ActiveMQ 3. For Alfresco, this broker URL needs to be updated in the alfresco-global.properties file. This is an example for a pretty simple URL with two brokers:

[alfresco@alf_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### ActiveMQ
messaging.broker.url=failover:(tcp://mq_n1.domain:61616,tcp://mq_n2.domain:61616)?timeout=3000&randomize=false&nested.daemon=false&nested.dynamicManagement=false
#messaging.username=
#messaging.password=
...
[alfresco@alf_n1 ~]$

 

There are a few things to note. The Failover used above is a transport layer that can be used in combination with any of the other transport methods/protocols. Here it’s used with two TCP protocols. The correct nomenclature is either:

  • failover:uri1,…,uriN
    • E.g.: failover:tcp://mq_n1.domain:61616,tcp://mq_n2.domain:61616 => the simplest broker URL for two brokers with no custom options
  • failover:uri1?URIOptions1,…,uriN?URIOptionsN
    • E.g.: failover:tcp://mq_n1.domain:61616?daemon=false&dynamicManagement=false&trace=false,tcp://mq_n2.domain:61616?daemon=false&dynamicManagement=true&trace=true => a more advanced broker URL with some custom options for each of the TCP protocol URIs
  • failover:(uri1?URIOptions1,…,uriN?URIOptionsN)?FailoverTransportOptions
    • E.g.: failover:(tcp://mq_n1.domain:61616?daemon=false&dynamicManagement=false&trace=false,tcp://mq_n2.domain:61616?daemon=false&dynamicManagement=true&trace=true)?timeout=3000&randomize=false => the same broker URL as above but, in addition, with some Failover Transport options
  • failover:(uri1,…,uriN)?FailoverTransportOptions&NestedURIOptions
    • E.g.: failover:(tcp://mq_n1.domain:61616,tcp://mq_n2.domain:61616)?timeout=3000&randomize=false&nested.daemon=false&nested.dynamicManagement=false&nested.trace=false => since ActiveMQ 5.9, it’s now possible to set the nested URIs options (here the TCP protocol options) at the end of the broker URL, they just need to be preceded by “nested.”. Nested options will apply to all URIs.

There are a lot of interesting parameters, these are some:

  • Failover Transport options:
    • backup=true: initialize and keep a second connection to another broker for faster failover
    • randomize=true: will pick a new URI for the reconnect randomly from the list of URIs
    • timeout=3000: time in ms before timeout on the send operations
    • priorityBackup=true: clients will fail over to other brokers in case the “primary” broker isn’t available (that’s always the case anyway) but they will consistently try to reconnect to the “primary” one. It is possible to specify several “primary” brokers with the priorityURIs option (comma separated list)
  • TCP Transport options:
    • daemon=false: specify that ActiveMQ isn’t running in a Spring or Web container
    • dynamicManagement=false: disabling the JMX management
    • trace=false: disabling the tracing

The full list of Failover Transport options is described here and the full list of TCP Transport options here.

II. Messaging Server’s configuration

I believe the simplest setup for Clustering in ActiveMQ is using the Master/Slave setup, that’s what I will talk about here. If you are looking for more information about the Network of Brokers, you can find that here. As mentioned previously, the idea behind the Master/Slave is to replicate somehow the messages to Slave brokers. To do that, there are three possible configurations:

  • Shared File System: use a shared file system
  • JDBC: use a Database Server
  • Replicated LevelDB Store: use a ZooKeeper Server. This has been deprecated in recent versions of ActiveMQ 5 in favour of KahaDB, which is a file-based persistence Database. Therefore, this actually is linked to the first configuration above (Shared File System)

In the scope of Alfresco, you should already have a shared file system as well as a shared Database Server for the Repository Clustering… So, it’s pretty easy to fill the prerequisites for ActiveMQ since you already have them. Of course, you can use a dedicated Shared File System or dedicated Database, that’s up to your requirements.

a. JDBC

For the JDBC configuration, you will need to change the persistenceAdapter to use the dedicated jdbcPersistenceAdapter and create the associated DataSource for your Database. ActiveMQ supports some DBs like Apache Derby, DB2, HSQL, MySQL, Oracle, PostgreSQL, SQLServer or Sybase. You will also need to add the JDBC driver library at the right location so that ActiveMQ can load it (usually “$ACTIVEMQ_HOME/lib/”).
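
For example, with PostgreSQL, that could be something as simple as the following (the jar name and source path are assumptions, use the driver matching your Database):

[alfresco@mq_n1 ~]$ cp /tmp/postgresql-42.2.6.jar $ACTIVEMQ_HOME/lib/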

[alfresco@mq_n1 ~]$ cat $ACTIVEMQ_HOME/conf/activemq.xml
<beans
  xmlns="http://www.springframework.org/schema/beans"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
  http://activemq.apache.org/schema/core http://activemq.apache.org/schema/core/activemq-core.xsd">
  ...
  <broker xmlns="http://activemq.apache.org/schema/core" brokerName="mq_n1" dataDirectory="${activemq.data}">
    ...
    <persistenceAdapter>
      <jdbcPersistenceAdapter dataDirectory="activemq-data" dataSource="postgresql-ds"/>
    </persistenceAdapter>
    ...
  </broker>
  ...
  <bean id="postgresql-ds" class="org.postgresql.ds.PGPoolingDataSource">
    <property name="serverName" value="db_vip"/>
    <property name="databaseName" value="alfresco"/>
    <property name="portNumber" value="5432"/>
    <property name="user" value="alfresco"/>
    <property name="password" value="My+P4ssw0rd"/>
    <property name="dataSourceName" value="postgres"/>
    <property name="initialConnections" value="1"/>
    <property name="maxConnections" value="10"/>
  </bean>
  ...
</beans>
[alfresco@mq_n1 ~]$

 

b. Shared File System

The Shared File System configuration is, from my point of view, the simplest one to configure but, for it to work properly, there are some things to note because you should use a shared file system that supports proper file locking. This means that:

  • you cannot use the Oracle Cluster File System (OCFS/OCFS2) because there is no cluster-aware flock or POSIX locks
  • if you are using NFS v3 or lower, you won’t have automatic failover from Master to Slave because there is no timeout and therefore the lock will never be released. You should therefore use NFS v4 instead

Additionally, you need to share the persistenceAdapter between all brokers but you cannot share the data folder completely, otherwise the logs will be overwritten by all brokers (that’s bad but it’s not really an issue) and, more importantly, the PID file will also be overwritten, which will cause issues when starting/stopping the Slave brokers…

Therefore, properly configuring the Shared File System is all about keeping the “$ACTIVEMQ_DATA” environment variable set to the place where you want the logs and PID files to be stored (i.e. locally) while overwriting the persistenceAdapter path so that it points to the Shared File System:

[alfresco@mq_n1 ~]$ # Root folder of the ActiveMQ binaries
[alfresco@mq_n1 ~]$ echo $ACTIVEMQ_HOME
/opt/activemq
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Location of the logs and PID file
[alfresco@mq_n1 ~]$ echo $ACTIVEMQ_DATA
/opt/activemq/data
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Location of the Shared File System
[alfresco@mq_n1 ~]$ echo $ACTIVEMQ_SHARED_DATA
/shared/file/system
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo systemctl stop activemq.service
[alfresco@mq_n1 ~]$ grep -A2 "<persistenceAdapter>" $ACTIVEMQ_HOME/conf/activemq.xml
    <persistenceAdapter>
      <kahaDB directory="${activemq.data}/kahadb"/>
    </persistenceAdapter>
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Put the KahaDB into the Shared File System
[alfresco@mq_n1 ~]$ sed -i "s, directory=\"[^\"]*\", directory=\"${ACTIVEMQ_SHARED_DATA}/activemq/kahadb\"," $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ grep -A2 "<persistenceAdapter>" $ACTIVEMQ_HOME/conf/activemq.xml
    <persistenceAdapter>
      <kahaDB directory="/shared/file/system/activemq/kahadb"/>
    </persistenceAdapter>
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo systemctl start activemq.service

 

Starting the Master ActiveMQ will display some information in the logs of the node1 showing that it has started properly and that it is listening for connections on the different transportConnectors:

[alfresco@mq_n1 ~]$ cat $ACTIVEMQ_DATA/activemq.log
2019-07-28 11:34:37,598 | INFO  | Refreshing org.apache.activemq.xbean.XBeanBrokerFactory$1@9f116cc: startup date [Sun Jul 28 11:34:37 CEST 2019]; root of context hierarchy | org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
2019-07-28 11:34:38,289 | INFO  | Using Persistence Adapter: KahaDBPersistenceAdapter[/shared/file/system/activemq/kahadb] | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:34:38,330 | INFO  | KahaDB is version 6 | org.apache.activemq.store.kahadb.MessageDatabase | main
2019-07-28 11:34:38,351 | INFO  | PListStore:[/opt/activemq/data/mq_n1/tmp_storage] started | org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
2019-07-28 11:34:38,479 | INFO  | Apache ActiveMQ 5.15.6 (mq_n1, ID:mq_n1-36925-1564306478360-0:1) is starting | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:34:38,533 | INFO  | Listening for connections at: tcp://mq_n1:61616?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:34:38,542 | INFO  | Connector openwire started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:34:38,545 | INFO  | Listening for connections at: amqp://mq_n1:5672?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:34:38,546 | INFO  | Connector amqp started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:34:38,552 | INFO  | Listening for connections at: stomp://mq_n1:61613?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:34:38,553 | INFO  | Connector stomp started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:34:38,556 | INFO  | Listening for connections at: mqtt://mq_n1:1883?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:34:38,561 | INFO  | Connector mqtt started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:34:38,650 | WARN  | ServletContext@o.e.j.s.ServletContextHandler@11841b15{/,null,STARTING} has uncovered http methods for path: / | org.eclipse.jetty.security.SecurityHandler | main
2019-07-28 11:34:38,710 | INFO  | Listening for connections at ws://mq_n1:61614?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.ws.WSTransportServer | main
2019-07-28 11:34:38,712 | INFO  | Connector ws started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:34:38,712 | INFO  | Apache ActiveMQ 5.15.6 (mq_n1, ID:mq_n1-36925-1564306478360-0:1) started | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:34:38,714 | INFO  | For help or more information please see: http://activemq.apache.org | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:34:39,118 | INFO  | No Spring WebApplicationInitializer types detected on classpath | /admin | main
2019-07-28 11:34:39,373 | INFO  | ActiveMQ WebConsole available at http://0.0.0.0:8161/ | org.apache.activemq.web.WebConsoleStarter | main
2019-07-28 11:34:39,373 | INFO  | ActiveMQ Jolokia REST API available at http://0.0.0.0:8161/api/jolokia/ | org.apache.activemq.web.WebConsoleStarter | main
2019-07-28 11:34:39,402 | INFO  | Initializing Spring FrameworkServlet 'dispatcher' | /admin | main
2019-07-28 11:34:39,532 | INFO  | No Spring WebApplicationInitializer types detected on classpath | /api | main
2019-07-28 11:34:39,563 | INFO  | jolokia-agent: Using policy access restrictor classpath:/jolokia-access.xml | /api | main
[alfresco@mq_n1 ~]$

 

Then starting a Slave will only display, in the node2 logs, the information that there is already a Master running and that the Slave is therefore just waiting and not listening for now:

[alfresco@mq_n2 ~]$ cat $ACTIVEMQ_DATA/activemq.log
2019-07-28 11:35:53,258 | INFO  | Refreshing org.apache.activemq.xbean.XBeanBrokerFactory$1@9f116cc: startup date [Sun Jul 28 11:35:53 CEST 2019]; root of context hierarchy | org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
2019-07-28 11:35:53,986 | INFO  | Using Persistence Adapter: KahaDBPersistenceAdapter[/shared/file/system/activemq/kahadb] | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:35:53,999 | INFO  | Database /shared/file/system/activemq/kahadb/lock is locked by another server. This broker is now in slave mode waiting a lock to be acquired | org.apache.activemq.store.SharedFileLocker | main
[alfresco@mq_n2 ~]$

 

Finally stopping the Master will automatically transform the Slave into a new Master, without any human interaction. From the node2 logs:

[alfresco@mq_n2 ~]$ cat $ACTIVEMQ_DATA/activemq.log
2019-07-28 11:35:53,258 | INFO  | Refreshing org.apache.activemq.xbean.XBeanBrokerFactory$1@9f116cc: startup date [Sun Jul 28 11:35:53 CEST 2019]; root of context hierarchy | org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
2019-07-28 11:35:53,986 | INFO  | Using Persistence Adapter: KahaDBPersistenceAdapter[/shared/file/system/activemq/kahadb] | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:35:53,999 | INFO  | Database /shared/file/system/activemq/kahadb/lock is locked by another server. This broker is now in slave mode waiting a lock to be acquired | org.apache.activemq.store.SharedFileLocker | main
  # The ActiveMQ Master on node1 has been stopped here (11:37:10)
2019-07-28 11:37:11,166 | INFO  | KahaDB is version 6 | org.apache.activemq.store.kahadb.MessageDatabase | main
2019-07-28 11:37:11,187 | INFO  | PListStore:[/opt/activemq/data/mq_n2/tmp_storage] started | org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
2019-07-28 11:37:11,316 | INFO  | Apache ActiveMQ 5.15.6 (mq_n2, ID:mq_n2-41827-1564306631196-0:1) is starting | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:37:11,370 | INFO  | Listening for connections at: tcp://mq_n2:61616?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:37:11,372 | INFO  | Connector openwire started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:37:11,379 | INFO  | Listening for connections at: amqp://mq_n2:5672?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:37:11,381 | INFO  | Connector amqp started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:37:11,386 | INFO  | Listening for connections at: stomp://mq_n2:61613?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:37:11,387 | INFO  | Connector stomp started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:37:11,390 | INFO  | Listening for connections at: mqtt://mq_n2:1883?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:37:11,391 | INFO  | Connector mqtt started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:37:11,485 | WARN  | ServletContext@o.e.j.s.ServletContextHandler@2cfbeac4{/,null,STARTING} has uncovered http methods for path: / | org.eclipse.jetty.security.SecurityHandler | main
2019-07-28 11:37:11,547 | INFO  | Listening for connections at ws://mq_n2:61614?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.ws.WSTransportServer | main
2019-07-28 11:37:11,548 | INFO  | Connector ws started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:37:11,556 | INFO  | Apache ActiveMQ 5.15.6 (mq_n2, ID:mq_n2-41827-1564306631196-0:1) started | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:37:11,558 | INFO  | For help or more information please see: http://activemq.apache.org | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:37:11,045 | INFO  | No Spring WebApplicationInitializer types detected on classpath | /admin | main
2019-07-28 11:37:11,448 | INFO  | ActiveMQ WebConsole available at http://0.0.0.0:8161/ | org.apache.activemq.web.WebConsoleStarter | main
2019-07-28 11:37:11,448 | INFO  | ActiveMQ Jolokia REST API available at http://0.0.0.0:8161/api/jolokia/ | org.apache.activemq.web.WebConsoleStarter | main
2019-07-28 11:37:11,478 | INFO  | Initializing Spring FrameworkServlet 'dispatcher' | /admin | main
2019-07-28 11:37:11,627 | INFO  | No Spring WebApplicationInitializer types detected on classpath | /api | main
2019-07-28 11:37:11,664 | INFO  | jolokia-agent: Using policy access restrictor classpath:/jolokia-access.xml | /api | main
[alfresco@mq_n2 ~]$

 

You can of course customize ActiveMQ as per your requirements, remove some connectors, setup SSL, aso… But that’s not really the purpose of this blog.

 

 

Other posts of this series on Alfresco HA/Clustering:

Cet article Alfresco Clustering – ActiveMQ est apparu en premier sur Blog dbi services.

Alfresco Clustering – Apache HTTPD as Load Balancer


In previous blogs, I talked about some basics and presented some possible architectures for Alfresco, I talked about the Clustering setup for the Alfresco Repository, the Alfresco Share and for ActiveMQ. In this one, I will talk about the Front-end layer, but in a very particular setup because it will also act as a Load Balancer. For an Alfresco solution, you can choose the front-end that you prefer and it can just act as a front-end to protect your Alfresco back-end components, to add SSL or whatever. There is no real preference but you will obviously need to know how to configure it. I posted a blog some years ago for Apache HTTPD as a simple front-end (here) or you can check the Alfresco documentation which now includes a section for that as well, but there is no official documentation for a Load Balancer setup.

In an Alfresco architecture that includes HA/Clustering you will, at some point, need a Load Balancer. From time to time, you will come across companies that do not already have a Load Balancer available and you might therefore have to provide something to fill this gap. Since you will most probably (should?) already have a front-end to protect Alfresco, why not use it as a Load Balancer as well? In this blog, I chose Apache HTTPD because that’s the front-end I’m usually using and I know it’s working fine as a LB as well.

In the architectures that I described in the first blog of this series, there was always a front-end installed on each node with Alfresco Share and there was a LB above that. Here, these two boxes are actually together. There are multiple ways to set that up but I didn’t want to talk about that in my first blog because it’s not really related to Alfresco, it’s above that, so it would just have multiplied the possible architectures that I wanted to present and my blog would have been way too long. There were also no communications between the different front-end nodes because, technically speaking, we aren’t going to set up Apache HTTPD as a Cluster, we only need to provide a High Availability solution.

Alright so let’s say that you don’t have a Load Balancer available and you want to use Apache HTTPD as a front-end+LB for a two-node Cluster. There are several solutions so here are two possible ways to do that from an inbound communication point of view that will still provide redundancy:

  • Setup a Round Robin DNS that points to both Apache HTTPD node1 and node2. The DNS will redirect connections to either of the two Apache HTTPD (Active/Active)
  • Setup a Failover DNS with a pretty low TimeToLive (TTL) which will point to a single Apache HTTPD node and redirect all traffic there. If this one isn’t available, it will failover to the second one (Active/Passive)

 

In both cases above, the Apache HTTPD configuration can be exactly the same, it will work. From an outbound communication point of view, Apache HTTPD will talk directly with all the Share nodes behind it. To avoid disconnections and loss of sessions in case an Apache HTTPD goes down, the solution will need to support session stickiness across all Apache HTTPD nodes. With that, all communications coming from a single browser will always be redirected to the same back-end server, which ensures that the sessions stay intact, even if you lose an Apache HTTPD. I mentioned previously that there won’t be any communications between the different front-ends so this session stickiness must be based on something present inside the session (header or cookie) or inside the URL.

With Apache HTTPD, you can use the Proxy modules to provide both a front-end configuration as well as a Load Balancer but, in this blog, I will use the JK module. The JK module is provided by Apache for communications between Apache HTTPD and Apache Tomcat. It has been designed and optimized for this purpose and it also provides/supports a Load Balancer configuration.
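
For reference, and only as a minimal sketch (hostnames and context paths are assumptions; the rest of this blog uses the JK module instead), a Load Balancer based on the Proxy modules would look something along these lines, with the relevant proxy/balancer modules enabled:

# Hypothetical mod_proxy alternative to the JK module
<Proxy "balancer://alfworker">
  BalancerMember "ajp://share_n1.domain:8009" route=share_n1
  BalancerMember "ajp://share_n2.domain:8009" route=share_n2
  # Stick the sessions based on the Tomcat session cookie/parameter
  ProxySet stickysession=JSESSIONID|jsessionid
</Proxy>
ProxyPass        "/share" "balancer://alfworker/share"
ProxyPassReverse "/share" "balancer://alfworker/share"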

 

I. Apache HTTPD setup for a single back-end node

For this example, I will use the package provided by Ubuntu for a simple installation. You can obviously build it from source to customize it, add your best practices, aso… This has nothing to do with the Clustering setup, it’s a simple front-end configuration for any installation. So let’s install a basic Apache HTTPD:

[alfresco@httpd_n1 ~]$ sudo apt-get install apache2 libapache2-mod-jk
[alfresco@httpd_n1 ~]$ sudo systemctl enable apache2.service
[alfresco@httpd_n1 ~]$ sudo systemctl daemon-reload
[alfresco@httpd_n1 ~]$ sudo a2enmod rewrite
[alfresco@httpd_n1 ~]$ sudo a2enmod ssl

 

Then, to configure it for a single back-end Alfresco node (I’m just showing a minimal configuration again, there is much more to do to add security & restrictions around Alfresco and mod_jk):

[alfresco@httpd_n1 ~]$ cat /etc/apache2/sites-available/alfresco-ssl.conf
...
<VirtualHost *:80>
    RewriteRule ^/?(.*) https://%{HTTP_HOST}/$1 [R,L]
</VirtualHost>

<VirtualHost *:443>
    ServerName            dns.domain
    ServerAlias           dns.domain dns
    ServerAdmin           email@domain
    SSLEngine             on
    SSLProtocol           -all +TLSv1.2
    SSLCipherSuite        EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH:AES2
    SSLHonorCipherOrder   on
    SSLVerifyClient       none
    SSLCertificateFile    /etc/pki/tls/certs/dns.domain.crt
    SSLCertificateKeyFile /etc/pki/tls/private/dns.domain.key

    RewriteRule ^/$ https://%{HTTP_HOST}/share [R,L]

    JkMount /* alfworker
</VirtualHost>
...
[alfresco@httpd_n1 ~]$
[alfresco@httpd_n1 ~]$ cat /etc/libapache2-mod-jk/workers.properties
worker.list=alfworker
worker.alfworker.type=ajp13
worker.alfworker.port=8009
worker.alfworker.host=share_n1.domain
worker.alfworker.lbfactor=1
[alfresco@httpd_n1 ~]$
[alfresco@httpd_n1 ~]$ sudo a2ensite alfresco-ssl
[alfresco@httpd_n1 ~]$ sudo a2dissite 000-default
[alfresco@httpd_n1 ~]$ sudo rm /etc/apache2/sites-enabled/000-default.conf
[alfresco@httpd_n1 ~]$
[alfresco@httpd_n1 ~]$ sudo service apache2 restart

 

That should do it for a single back-end Alfresco node. Again, this was just an example, I wouldn’t recommend using the configuration as is (inside the alfresco-ssl.conf file), there is much more to do for security reasons.

 

II. Adaptation for a Load Balancer configuration

If you want to configure your Apache HTTPD as a Load Balancer, then on top of the standard setup shown above, you just have to modify two things:

  • Modify the JK module configuration to use a Load Balancer
  • Modify the Apache Tomcat configuration to add an identifier for Apache HTTPD to be able to redirect the communication to the correct back-end node (session stickiness). This ID put in the Apache Tomcat configuration will extend the Session’s ID like that: <session_id>.<tomcat_id>

 

So on all the nodes hosting the Apache HTTPD, you should put the exact same configuration:

[alfresco@httpd_n1 ~]$ cat /etc/libapache2-mod-jk/workers.properties
worker.list=alfworker

worker.alfworker.type=lb
worker.alfworker.balance_workers=node1,node2
worker.alfworker.sticky_session=true
worker.alfworker.method=B

worker.node1.type=ajp13
worker.node1.port=8009
worker.node1.host=share_n1.domain
worker.node1.lbfactor=1

worker.node2.type=ajp13
worker.node2.port=8009
worker.node2.host=share_n2.domain
worker.node2.lbfactor=1
[alfresco@httpd_n1 ~]$
[alfresco@httpd_n1 ~]$ sudo service apache2 reload

 

With the above configuration, we keep the same JK Worker (alfworker) but instead of using an ajp13 type, we use a lb type (line 4) which is an encapsulation. The alfworker will use 2 sub-workers named node1 and node2 (line 5), which are just generic names. The alfworker will also enable stickiness and use the method B (Busyness), which means that for new sessions, Apache HTTPD will choose the worker with the fewest requests being served, divided by the lbfactor value.

Each sub-worker (node1 and node2) defines its type, which is ajp13 this time, the port and host it should target (where the Share nodes are located) and the lbfactor. As mentioned above, increasing the lbfactor means that more requests are going to be sent to this worker:

  • For the node2 to serve 100% more requests than the node1 (x2), then set worker.node1.lbfactor=1 and worker.node2.lbfactor=2
  • For the node2 to serve 50% more requests than the node1 (x1.5), then set worker.node1.lbfactor=2 and worker.node2.lbfactor=3

 

The second thing to do is to modify the Apache Tomcat configuration to add a specific ID. On the Share node1:

[alfresco@share_n1 ~]$ grep "<Engine" $CATALINA_HOME/conf/server.xml
    <Engine name="Catalina" defaultHost="localhost" jvmRoute="share_n1">
[alfresco@share_n1 ~]$

 

On the Share node2:

[alfresco@share_n2 ~]$ grep "<Engine" $CATALINA_HOME/conf/server.xml
    <Engine name="Catalina" defaultHost="localhost" jvmRoute="share_n2">
[alfresco@share_n2 ~]$

 

The value to be put in the jvmRoute parameter is just a string so it can be anything but it must be unique across all Share nodes so that the Apache HTTPD JK module can find the correct back-end node that it should transfer the requests to.

It’s that simple to configure Apache HTTPD as a Load Balancer in front of Alfresco… To check which back-end server you are currently using, you can use the browser’s utilities and in particular the network recording which will display, in the headers/cookies section, the Session ID and therefore the value that you put in the jvmRoute.
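
If you prefer the command line, a quick sketch of the same check could be to look at the session cookie returned by Share, which should end with the jvmRoute value described above (the hostname is an assumption and, depending on your setup, you might need to authenticate first):

[alfresco@httpd_n1 ~]$ # The JSESSIONID value should end with ".share_n1" or ".share_n2"
[alfresco@httpd_n1 ~]$ curl -k -s -L -o /dev/null -c - "https://dns.domain/share/page/" | grep -i "JSESSIONID"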

 

 

Other posts of this series on Alfresco HA/Clustering:

Cet article Alfresco Clustering – Apache HTTPD as Load Balancer est apparu en premier sur Blog dbi services.

Alfresco Clustering – Solr6


In previous blogs, I talked about some basics and presented some possible architectures for Alfresco, I talked about the Clustering setup for the Alfresco Repository, the Alfresco Share and for ActiveMQ. I also set up an Apache HTTPD as a Load Balancer. In this one, I will talk about the last layer that I wanted to present, which is Solr and more particularly Solr6 (Alfresco Search Services) Sharding. I planned on writing a blog related to Solr Sharding Concepts & Methods to explain what it brings concretely but unfortunately, it’s not ready yet. I will try to post it in the next few weeks, if I find the time.

 

I. Solr configuration modes

So, Solr supports/provides three configuration modes:

  • Master-Slave
  • SolrCloud
  • Standalone


Master-Slave: It’s a first specific configuration mode which is pretty old. In this one, the Master node is the only one to index the content and all the Slave nodes will replicate the Master’s index. This is a first step to provide a Clustering solution with Solr, and Alfresco supports it, but this solution has some important drawbacks. For example, and contrary to an ActiveMQ Master-Slave solution, Solr cannot change the Master. Therefore, if you lose your Master, there is no indexing happening anymore and you need to manually change the configuration file on each of the remaining nodes to specify a new Master and target all the remaining Slave nodes to use the new Master. This isn’t what I will be talking about in this blog.

SolrCloud: It’s another specific configuration mode which is a little bit more recent, introduced in Solr4 I believe. SolrCloud is a true Clustering solution using a ZooKeeper Server. It adds an additional layer on top of a Standalone Solr which slows it down a little bit, especially on infrastructures with a huge demand on indexing. But at some point, when you start having dozens of Solr nodes, you need a central place to organize and configure them and that’s what SolrCloud is very good at. This solution provides Fault Tolerance as well as High Availability. I’m not sure if SolrCloud could be used by Alfresco because, sure, SolrCloud also has Shards and its behaviour is pretty similar to a Standalone Solr, but it’s not entirely working in the same way. Maybe it’s possible, however I have never seen it so far. Might be the subject of some testing later… In any case, using SolrCloud for Alfresco might not be that useful because it’s really easier to set up a Master-Master Solr mixed with Solr Sharding for pretty much the same benefits. So, I won’t talk about SolrCloud here either.

You guessed it, in this blog, I will only talk about Standalone Solr nodes and only using Shards. Alfresco supports Solr Shards only since version 5.1. Before that, it wasn’t possible to use this feature, even if Solr4 provided it already. When using the two default cores (the famous “alfresco” & “archive” cores), with all Alfresco versions supporting Solr (so since Alfresco 4), it is possible to have a Highly Available Solr installation by setting up two Solr Standalone nodes and putting a Load Balancer in front of them but, in this case, there is no communication between the Solr nodes so it’s only a HA solution, nothing more.
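
In that kind of setup, the Repository side would simply point to the Load Balancer instead of a specific Solr node, for example with something along these lines in the alfresco-global.properties (the hostname, port and secureComms value are assumptions to adapt to your Load Balancer):

### Solr6 search subsystem pointing to the Solr Load Balancer
index.subsystem.name=solr6
solr.host=solr-lb.domain
solr.port=8983
solr.secureComms=none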

 

In the architectures that I presented in the first blog of this series, if you remember the schema N°5 (you probably don’t but no worry, I didn’t either), I put a link between the two Solr nodes and I mentioned the following related to this architecture:
“N°5: […]. Between the two Solr nodes, I put a Clustering link, that’s in case you are using Solr Sharding. If you are using the default cores (alfresco and archive), then there is no communication between distinct Solr nodes. If you are using Solr Sharding and if you want a HA architecture, then you will have the same Shards on both Solr nodes and in this case, there will be communications between the Solr nodes, it’s not really a Clustering so to speak, that’s how Solr Sharding is working but I still used the same representation.”

 

II. Solr Shards creation

As mentioned earlier in this blog, there are real Cluster solutions with Solr but, in the case of Alfresco, because of the features that Alfresco adds like the Shard Registration, there is no real need to set up complex things like that. Having just a simple Master-Master installation of Solr6 with Sharding is already a very good and strong solution to provide Fault Tolerance, High Availability, Automatic Failover, Performance improvements, aso… So how can that be set up?

First, you will need to install at least two Solr Standalone nodes. You can use exactly the same setup for all nodes and it’s also exactly the same setup whether you use the default cores or Solr Sharding, so just do what you are always doing. For the Tracking, you will need to use the Load Balancer URL so it can target all Repository nodes, if there are several.
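
On the Solr side, the Tracking target is usually defined through the alfresco.* properties of the solrcore.properties used by the template/cores, so it is a matter of pointing them to the Load Balancer as well. A minimal sketch (the values and the use of plain HTTP are assumptions, adapt them to your installation):

# Tracking properties inside the solrcore.properties of the template/cores (illustrative values)
alfresco.host=alfresco-lb.domain
alfresco.port=8080
alfresco.secureComms=none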

If you created the default cores, you can remove them easily:

[alfresco@solr_n1 ~]$ curl -v "http://localhost:8983/solr/admin/cores?action=removeCore&storeRef=workspace://SpacesStore&coreName=alfresco"
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=removeCore&storeRef=workspace://SpacesStore&coreName=alfresco HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 150
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">524</int></lst>
</response>
* Connection #0 to host localhost left intact
[alfresco@solr_n1 ~]$
[alfresco@solr_n1 ~]$ curl -v "http://localhost:8983/solr/admin/cores?action=removeCore&storeRef=archive://SpacesStore&coreName=archive"
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=removeCore&storeRef=archive://SpacesStore&coreName=archive HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 150
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">485</int></lst>
</response>
* Connection #0 to host localhost left intact
[alfresco@solr_n1 ~]$

 

A status of “0” means that it’s successful.

Once that’s done, you can then simply create the Shards. In this example, I will:

  • use the DB_ID_RANGE method
  • use two Solr nodes
  • for workspace://SpacesStore: create 2 Shards out of a maximum of 10 with a range of 20M
  • for archive://SpacesStore: create 1 Shard out of a maximum of 5 with a range of 50M

Since I will use only two Solr nodes and since I want a High Availability on each of the Shards, I will need to have them all on both nodes. With a simple loop, it’s pretty easy to create all the Shards:

[alfresco@solr_n1 ~]$ solr_host=localhost
[alfresco@solr_n1 ~]$ solr_node_id=1
[alfresco@solr_n1 ~]$ begin_range=0
[alfresco@solr_n1 ~]$ range=19999999
[alfresco@solr_n1 ~]$ total_shards=10
[alfresco@solr_n1 ~]$
[alfresco@solr_n1 ~]$ for shard_id in `seq 0 1`; do
>   end_range=$((${begin_range} + ${range}))
>   curl -v "http://${solr_host}:8983/solr/admin/cores?action=newCore&storeRef=workspace://SpacesStore&numShards=${total_shards}&numNodes=${total_shards}&nodeInstance=${solr_node_id}&template=rerank&coreName=alfresco&shardIds=${shard_id}&property.shard.method=DB_ID_RANGE&property.shard.range=${begin_range}-${end_range}&property.shard.instance=${shard_id}"
>   echo ""
>   echo "  -->  Range N°${shard_id} created with: ${begin_range}-${end_range}"
>   echo ""
>   sleep 2
>   begin_range=$((${end_range} + 1))
> done

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=newCore&storeRef=workspace://SpacesStore&numShards=10&numNodes=10&nodeInstance=1&template=rerank&coreName=alfresco&shardIds=0&property.shard.method=DB_ID_RANGE&property.shard.range=0-19999999&property.shard.instance=0 HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 182
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">254</int></lst><str name="core">alfresco-0</str>
</response>
* Connection #0 to host localhost left intact

  -->  Range N°0 created with: 0-19999999


*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=newCore&storeRef=workspace://SpacesStore&numShards=10&numNodes=10&nodeInstance=1&template=rerank&coreName=alfresco&shardIds=1&property.shard.method=DB_ID_RANGE&property.shard.range=20000000-39999999&property.shard.instance=1 HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 182
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">228</int></lst><str name="core">alfresco-1</str>
</response>
* Connection #0 to host localhost left intact

  -->  Range N°1 created with: 20000000-39999999

[alfresco@solr_n1 ~]$
[alfresco@solr_n1 ~]$ begin_range=0
[alfresco@solr_n1 ~]$ range=49999999
[alfresco@solr_n1 ~]$ total_shards=4
[alfresco@solr_n1 ~]$ for shard_id in `seq 0 0`; do
>   end_range=$((${begin_range} + ${range}))
>   curl -v "http://${solr_host}:8983/solr/admin/cores?action=newCore&storeRef=archive://SpacesStore&numShards=${total_shards}&numNodes=${total_shards}&nodeInstance=${solr_node_id}&template=rerank&coreName=archive&shardIds=${shard_id}&property.shard.method=DB_ID_RANGE&property.shard.range=${begin_range}-${end_range}&property.shard.instance=${shard_id}"
>   echo ""
>   echo "  -->  Range N°${shard_id} created with: ${begin_range}-${end_range}"
>   echo ""
>   sleep 2
>   begin_range=$((${end_range} + 1))
> done

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=newCore&storeRef=archive://SpacesStore&numShards=4&numNodes=4&nodeInstance=1&template=rerank&coreName=archive&shardIds=0&property.shard.method=DB_ID_RANGE&property.shard.range=0-49999999&property.shard.instance=0 HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 181
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">231</int></lst><str name="core">archive-0</str>
</response>
* Connection #0 to host localhost left intact

-->  Range N°0 created with: 0-49999999

[alfresco@solr_n1 ~]$

 

On the Solr node2, to create the same Shards (another Instance of each Shard) and therefore provide the expected setup, just re-execute the same commands but replacing solr_node_id=1 with solr_node_id=2. That’s all there is to do on Solr side, just creating the Shards is sufficient. On the Alfresco side, configure the Shards registration to use the Dynamic mode:

[alfresco@alf_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
# Solr Sharding
solr.useDynamicShardRegistration=true
search.solrShardRegistry.purgeOnInit=true
search.solrShardRegistry.shardInstanceTimeoutInSeconds=60
search.solrShardRegistry.maxAllowedReplicaTxCountDifference=500
...
[alfresco@alf_n1 ~]$

 

After a quick restart, all the Shard’s Instances will register themselves to Alfresco and you should see that each Shard has its two Shard’s Instances. Thanks to the constant Tracking, Alfresco knows which Shard’s Instances are healthy (up-to-date) and which ones aren’t (either lagging behind or completely silent). When performing searches, Alfresco will make a request to any of the healthy Shard’s Instances. Solr will be aware of the healthy Shard’s Instances as well and it will start the distribution of the search request to all the Shards for the parallel query. This is the communication between the Solr nodes that I mentioned earlier: it’s not really Clustering but rather query distribution between all the healthy Shard’s Instances.

 

 

Other posts of this series on Alfresco HA/Clustering:

Cet article Alfresco Clustering – Solr6 est apparu en premier sur Blog dbi services.

Alfresco – Share Clustering fail with ‘Ignored XML validation warning’


In a recent project on Alfresco, I had to set up a Clustering environment. It all went smoothly but I did face one single issue with the setup of the Clustering on the Alfresco Share layer. That’s something I had never faced before and you will understand why below.

Initially, to setup the Alfresco Share Clustering, I used the sample file packaged in the distribution zip (E.g.: alfresco-content-services-distribution-6.1.0.5.zip):

<?xml version='1.0' encoding='UTF-8'?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hz="http://www.hazelcast.com/schema/spring"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
                http://www.hazelcast.com/schema/spring
                https://hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd">

   <!--
        Hazelcast distributed messaging configuration - Share web-tier cluster config
        - see http://www.hazelcast.com/docs.jsp
        - and specifically http://docs.hazelcast.org/docs/2.4/manual/html-single/#SpringIntegration
   -->
   <!-- Configure cluster to use either Multicast or direct TCP-IP messaging - multicast is default -->
   <!-- Optionally specify network interfaces - server machines likely to have more than one interface -->
   <!-- The messaging topic - the "name" is also used by the persister config below -->
   <!--
   <hz:topic id="topic" instance-ref="webframework.cluster.slingshot" name="slingshot-topic"/>
   <hz:hazelcast id="webframework.cluster.slingshot">
      <hz:config>
         <hz:group name="slingshot" password="alfresco"/>
         <hz:network port="5801" port-auto-increment="true">
            <hz:join>
               <hz:multicast enabled="true"
                     multicast-group="224.2.2.5"
                     multicast-port="54327"/>
               <hz:tcp-ip enabled="false">
                  <hz:members></hz:members>
               </hz:tcp-ip>
            </hz:join>
            <hz:interfaces enabled="false">
               <hz:interface>192.168.1.*</hz:interface>
            </hz:interfaces>
         </hz:network>
      </hz:config>
   </hz:hazelcast>

   <bean id="webframework.cluster.clusterservice" class="org.alfresco.web.site.ClusterTopicService" init-method="init">
      <property name="hazelcastInstance" ref="webframework.cluster.slingshot" />
      <property name="hazelcastTopicName"><value>slingshot-topic</value></property>
   </bean>
   -->

</beans>

 

I obviously uncommented the whole section and configured it properly for the Share Clustering. The above content is only the default/sample content, nothing more.
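
For reference, this is roughly what the configured section could look like in a two-node TCP-IP setup. This is only a sketch: the host names, port and topic name below are taken from the logs shown later in this post and from my own conventions, not from the sample file:

   <!-- Hedged sketch of a configured section: two Share nodes using TCP-IP instead of multicast -->
   <hz:topic id="topic" instance-ref="webframework.cluster.slingshot" name="share_hz_test"/>
   <hz:hazelcast id="webframework.cluster.slingshot">
      <hz:config>
         <hz:group name="slingshot" password="alfresco"/>
         <hz:network port="5801" port-auto-increment="true">
            <hz:join>
               <hz:multicast enabled="false" multicast-group="224.2.2.5" multicast-port="54327"/>
               <hz:tcp-ip enabled="true">
                  <hz:members>share_n1.domain,share_n2.domain</hz:members>
               </hz:tcp-ip>
            </hz:join>
            <hz:interfaces enabled="false">
               <hz:interface>192.168.1.*</hz:interface>
            </hz:interfaces>
         </hz:network>
      </hz:config>
   </hz:hazelcast>

   <bean id="webframework.cluster.clusterservice" class="org.alfresco.web.site.ClusterTopicService" init-method="init">
      <property name="hazelcastInstance" ref="webframework.cluster.slingshot" />
      <property name="hazelcastTopicName"><value>share_hz_test</value></property>
   </bean>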

Once configured, I restarted Alfresco but it failed with the following messages:

24-Aug-2019 14:35:12.974 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
24-Aug-2019 14:35:12.974 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet Engine: Apache Tomcat/8.5.34
24-Aug-2019 14:35:12.988 INFO [localhost-startStop-1] org.apache.catalina.startup.HostConfig.deployDescriptor Deploying configuration descriptor [/opt/tomcat/conf/Catalina/localhost/share.xml]
Aug 24, 2019 2:35:15 PM org.apache.jasper.servlet.TldScanner scanJars
INFO: At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
Aug 24, 2019 2:35:15 PM org.apache.catalina.core.ApplicationContext log
INFO: No Spring WebApplicationInitializer types detected on classpath
Aug 24, 2019 2:35:15 PM org.apache.catalina.core.ApplicationContext log
INFO: Initializing Spring root WebApplicationContext
2019-08-23 14:35:16,052  WARN  [factory.xml.XmlBeanDefinitionReader] [localhost-startStop-1] Ignored XML validation warning
 org.xml.sax.SAXParseException; lineNumber: 18; columnNumber: 92; schema_reference.4: Failed to read schema document 'https://hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.
	at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:204)
	at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.warning(ErrorHandlerWrapper.java:100)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:392)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:306)
	at java.xml/com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.reportSchemaErr(XSDHandler.java:4218)
  ... 69 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
	at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
	at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
	... 89 more
...
2019-08-23 14:35:16,067  ERROR [web.context.ContextLoader] [localhost-startStop-1] Context initialization failed
 org.springframework.beans.factory.parsing.BeanDefinitionParsingException: Configuration problem: Failed to import bean definitions from relative location [surf-config.xml]
Offending resource: class path resource [web-application-config.xml]; nested exception is org.springframework.beans.factory.parsing.BeanDefinitionParsingException: Configuration problem: Failed to import bean definitions from URL location [classpath*:alfresco/web-extension/*-context.xml]
Offending resource: class path resource [surf-config.xml]; nested exception is org.springframework.beans.factory.xml.XmlBeanDefinitionStoreException: Line 18 in XML document from file [/opt/tomcat/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 18; columnNumber: 92; cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'hz:topic'.
	at org.springframework.beans.factory.parsing.FailFastProblemReporter.error(FailFastProblemReporter.java:68)
	at org.springframework.beans.factory.parsing.ReaderContext.error(ReaderContext.java:85)
	at org.springframework.beans.factory.parsing.ReaderContext.error(ReaderContext.java:76)
  ... 33 more
Caused by: org.springframework.beans.factory.parsing.BeanDefinitionParsingException: Configuration problem: Failed to import bean definitions from URL location [classpath*:alfresco/web-extension/*-context.xml]
Offending resource: class path resource [surf-config.xml]; nested exception is org.springframework.beans.factory.xml.XmlBeanDefinitionStoreException: Line 18 in XML document from file [/opt/tomcat/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 18; columnNumber: 92; cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'hz:topic'.
	at org.springframework.beans.factory.parsing.FailFastProblemReporter.error(FailFastProblemReporter.java:68)
	at org.springframework.beans.factory.parsing.ReaderContext.error(ReaderContext.java:85)
	at org.springframework.beans.factory.parsing.ReaderContext.error(ReaderContext.java:76)
	... 42 more
Caused by: org.springframework.beans.factory.xml.XmlBeanDefinitionStoreException: Line 18 in XML document from file [/opt/tomcat/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 18; columnNumber: 92; cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'hz:topic'.
	at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:397)
	at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:335)
	at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:303)
	... 44 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 18; columnNumber: 92; cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'hz:topic'.
	at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:204)
	at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:135)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:396)
	... 64 more
...
24-Aug-2019 14:35:16.196 SEVERE [localhost-startStop-1] org.apache.catalina.core.StandardContext.startInternal One or more listeners failed to start. Full details will be found in the appropriate container log file
24-Aug-2019 14:35:16.198 SEVERE [localhost-startStop-1] org.apache.catalina.core.StandardContext.startInternal Context [/share] startup failed due to previous errors
Aug 24, 2019 2:35:16 PM org.apache.catalina.core.ApplicationContext log
...

 

As you can see above, the message is pretty clear: there is a problem within the file “/opt/tomcat/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml” which is causing Share to fail to start properly. The first warning message points you directly to the issue: “Failed to read schema document ‘https://hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd’”.

After checking the content of the sample file and comparing it with a working one, I found out what was wrong. To solve this specific issue, you can simply replace “https://hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd” with “http://www.hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd“. Please note the two differences in the URL:

  • Switch from “https” to “http”
  • Switch from “hazelcast.com” to “www.hazelcast.com”

 

The issue was actually caused by the fact that this installation was completely offline, with no access to the internet. Because of that, Spring could not download the XSD to validate the definitions in the context file. The solution is therefore to switch the URL to the “http://www.hazelcast.com” form: that is the form Spring knows how to resolve locally (most likely via the META-INF/spring.schemas mapping shipped in the Hazelcast JAR), so the validation is done against the local copy of the XSD instead of looking for it online.
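
If you want to verify that assumption on your own installation, here is a hedged sketch (the JAR name and location may differ depending on the Share version/packaging):

# Hedged sketch: list the schema URLs that Spring can resolve locally from the Hazelcast JAR(s)
cd $CATALINA_HOME/webapps/share/WEB-INF/lib
for jar in hazelcast*.jar; do
   echo "== ${jar}"
   unzip -p "${jar}" META-INF/spring.schemas 2>/dev/null | grep "hazelcast-spring-2.4.xsd"
done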

As mentioned previously, I had never faced this issue before for two main reasons:

  • I usually don’t use the sample files provided by Alfresco, I always prefer to build my own
  • I mainly install Alfresco on servers which have internet access (outgoing communications allowed)

 

Once the URL is corrected, Alfresco Share is able to start and the Clustering is configured properly:

24-Aug-2019 14:37:22.558 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
24-Aug-2019 14:37:22.558 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet Engine: Apache Tomcat/8.5.34
24-Aug-2019 14:37:22.573 INFO [localhost-startStop-1] org.apache.catalina.startup.HostConfig.deployDescriptor Deploying configuration descriptor [/opt/tomcat/conf/Catalina/localhost/share.xml]
Aug 24, 2019 2:37:24 PM org.apache.jasper.servlet.TldScanner scanJars
INFO: At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
Aug 24, 2019 2:37:25 PM org.apache.catalina.core.ApplicationContext log
INFO: No Spring WebApplicationInitializer types detected on classpath
Aug 24, 2019 2:37:25 PM org.apache.catalina.core.ApplicationContext log
INFO: Initializing Spring root WebApplicationContext
Aug 24, 2019 2:37:28 PM com.hazelcast.impl.AddressPicker
INFO: Resolving domain name 'share_n1.domain' to address(es): [10.10.10.10]
Aug 24, 2019 2:37:28 PM com.hazelcast.impl.AddressPicker
INFO: Resolving domain name 'share_n2.domain' to address(es): [127.0.0.1, 10.10.10.11]
Aug 24, 2019 2:37:28 PM com.hazelcast.impl.AddressPicker
INFO: Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [share_n1.domain/10.10.10.10, share_n2.domain/10.10.10.11, share_n2.domain/127.0.0.1]
Aug 24, 2019 2:37:28 PM com.hazelcast.impl.AddressPicker
INFO: Prefer IPv4 stack is true.
Aug 24, 2019 2:37:28 PM com.hazelcast.impl.AddressPicker
INFO: Picked Address[share_n2.domain]:5801, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5801], bind any local is true
Aug 24, 2019 2:37:28 PM com.hazelcast.system
INFO: [share_n2.domain]:5801 [slingshot] Hazelcast Community Edition 2.4 (20121017) starting at Address[share_n2.domain]:5801
Aug 24, 2019 2:37:28 PM com.hazelcast.system
INFO: [share_n2.domain]:5801 [slingshot] Copyright (C) 2008-2012 Hazelcast.com
Aug 24, 2019 2:37:28 PM com.hazelcast.impl.LifecycleServiceImpl
INFO: [share_n2.domain]:5801 [slingshot] Address[share_n2.domain]:5801 is STARTING
Aug 24, 2019 2:37:28 PM com.hazelcast.impl.TcpIpJoiner
INFO: [share_n2.domain]:5801 [slingshot] Connecting to possible member: Address[share_n1.domain]:5801
Aug 24, 2019 2:37:28 PM com.hazelcast.nio.ConnectionManager
INFO: [share_n2.domain]:5801 [slingshot] 54991 accepted socket connection from share_n1.domain/10.10.10.10:5801
Aug 24, 2019 2:37:29 PM com.hazelcast.impl.Node
INFO: [share_n2.domain]:5801 [slingshot] ** setting master address to Address[share_n1.domain]:5801
Aug 24, 2019 2:37:35 PM com.hazelcast.cluster.ClusterManager
INFO: [share_n2.domain]:5801 [slingshot]

Members [2] {
	Member [share_n1.domain]:5801
	Member [share_n2.domain]:5801 this
}

Aug 24, 2019 2:37:37 PM com.hazelcast.impl.LifecycleServiceImpl
INFO: [share_n2.domain]:5801 [slingshot] Address[share_n2.domain]:5801 is STARTED
2019-08-23 14:37:37,664  INFO  [web.site.ClusterTopicService] [localhost-startStop-1] Init complete for Hazelcast cluster - listening on topic: share_hz_test
...

 

This article Alfresco – Share Clustering fails with ‘Ignored XML validation warning’ appeared first on Blog dbi services.

Documentum – Encryption/Decryption of WebTop 6.8 passwords ‘REJECTED’ with recent JDK


Recently, we had a project to modernize a pretty old Documentum installation. As part of this project, there was a refresh of the Application Server hosting a WebTop 6.8. In this blog, I will be talking about an issue that we faced with the encryption and decryption of passwords in the refreshed environment. This new environment was using WebLogic 12.1.3 with the latest PSU, in conjunction with JDK 1.8u192. Since WebTop 6.8 P08, JDK 1.8u111 is supported, so a newer JDK 8 version should mostly work without much trouble.

To properly deploy a WebTop application, you will need to encrypt some passwords like the Preferences or Preset passwords. Doing so in the new environment unfortunately failed:

[weblogic@wls_01 ~]$ work_dir=/tmp/work
[weblogic@wls_01 ~]$ cd ${work_dir}/
[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ jar -xf webtop_6.8_P27.war WEB-INF/classes WEB-INF/lib
[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ kc="${work_dir}/WEB-INF/classes/com/documentum/web/formext/session/KeystoreCredentials.properties"
[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ sed -i "s,use_dfc_config_dir=[^$]*,use_dfc_config_dir=false," ${kc}
[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ sed -i "s,keystore.file.location=[^$]*,keystore.file.location=${work_dir}," ${kc}
[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ grep -E "^use_dfc_config_dir|^keystore.file.location" ${kc}
use_dfc_config_dir=false
keystore.file.location=/tmp/work
[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ enc_classpath="${work_dir}/WEB-INF/classes:${work_dir}/WEB-INF/lib/*"
[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ java -classpath "${enc_classpath}" com.documentum.web.formext.session.TrustedAuthenticatorTool "MyP4ssw0rd"
Aug 27, 2019 11:02:23 AM java.io.ObjectInputStream filterCheck
INFO: ObjectInputFilter REJECTED: class com.rsa.cryptoj.o.nc, array length: -1, nRefs: 1, depth: 1, bytes: 72, ex: n/a
java.security.UnrecoverableKeyException: Rejected by the jceks.key.serialFilter or jdk.serialFilter property
        at com.sun.crypto.provider.KeyProtector.unseal(KeyProtector.java:352)
        at com.sun.crypto.provider.JceKeyStore.engineGetKey(JceKeyStore.java:136)
        at java.security.KeyStoreSpi.engineGetEntry(KeyStoreSpi.java:473)
        at java.security.KeyStore.getEntry(KeyStore.java:1521)
        at com.documentum.web.formext.session.TrustedAuthenticatorUtils.getSecretKey(Unknown Source)
        at com.documentum.web.formext.session.TrustedAuthenticatorUtils.decryptByDES(Unknown Source)
        at com.documentum.web.formext.session.TrustedAuthenticatorTool.main(TrustedAuthenticatorTool.java:64)
[weblogic@wls_01 work]$

 

As you can see above, the password encryption fails with an error. The issue is that, starting with JDK 1.8u171, Oracle introduced new restrictions. From the Oracle release note (JDK-8189997):

New Features
security-libs/javax.crypto
Enhanced KeyStore Mechanisms
A new security property named jceks.key.serialFilter has been introduced. If this filter is configured, the JCEKS KeyStore uses it during the deserialization of the encrypted Key object stored inside a SecretKeyEntry. If it is not configured or if the filter result is UNDECIDED (for example, none of the patterns match), then the filter configured by jdk.serialFilter is consulted.

If the system property jceks.key.serialFilter is also supplied, it supersedes the security property value defined here.

The filter pattern uses the same format as jdk.serialFilter. The default pattern allows java.lang.Enum, java.security.KeyRep, java.security.KeyRep$Type, and javax.crypto.spec.SecretKeySpec but rejects all the others.

Customers storing a SecretKey that does not serialize to the above types must modify the filter to make the key extractable.

 

Recent versions of Documentum Administrator, for example, have no issue because they comply with this filter, but WebTop 6.8 doesn’t; therefore, to be able to encrypt/decrypt the passwords, you will have to modify the filter. There are several solutions to this problem:

  • Downgrade the JDK: this isn’t a good solution since it might introduce security vulnerabilities and it will also prevent you from upgrading it in the future so…
  • Extend the ‘jceks.key.serialFilter‘ definition inside the ‘$JAVA_HOME/jre/lib/security/java.security‘ file: that’s a possibility but it means that any process using this Java installation will use the updated filter list. Whether or not that’s acceptable is up to you
  • Override the ‘jceks.key.serialFilter‘ definition using a JVM startup parameter on a per-process basis: this gives better control on which processes are allowed to use updated filters and which ones aren’t

 

The simplest, and most probably the best, way to solve this issue is therefore to add a command-line parameter specifying the additional classes you want to allow. By default, the ‘java.security‘ file provides a list of allowed classes and it ends with ‘!*‘, which means that everything else is forbidden.

[weblogic@wls_01 work]$ grep -A2 "^jceks.key.serialFilter" $JAVA_HOME/jre/lib/security/java.security
jceks.key.serialFilter = java.lang.Enum;java.security.KeyRep;\
  java.security.KeyRep$Type;javax.crypto.spec.SecretKeySpec;!*

[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ grep "^security.provider" $JAVA_HOME/jre/lib/security/java.security
security.provider.1=com.rsa.jsafe.provider.JsafeJCE
security.provider.2=com.rsa.jsse.JsseProvider
security.provider.3=sun.security.provider.Sun
security.provider.4=sun.security.rsa.SunRsaSign
security.provider.5=sun.security.ec.SunEC
security.provider.6=com.sun.net.ssl.internal.ssl.Provider
security.provider.7=com.sun.crypto.provider.SunJCE
security.provider.8=sun.security.jgss.SunProvider
security.provider.9=com.sun.security.sasl.Provider
security.provider.10=org.jcp.xml.dsig.internal.dom.XMLDSigRI
security.provider.11=sun.security.smartcardio.SunPCSC
[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ # Using an empty parameter allows everything (not the best idea)
[weblogic@wls_01 work]$ java -Djceks.key.serialFilter='' -classpath "${enc_classpath}" com.documentum.web.formext.session.TrustedAuthenticatorTool "MyP4ssw0rd"
Encrypted: [4Fc6kvmUc9cCSQXUqGkp+A==], Decrypted: [MyP4ssw0rd]
[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ # Using the default value from java.security causes the issue
[weblogic@wls_01 work]$ java -Djceks.key.serialFilter='java.lang.Enum;java.security.KeyRep;java.security.KeyRep$Type;javax.crypto.spec.SecretKeySpec;!*' -classpath "${enc_classpath}" com.documentum.web.formext.session.TrustedAuthenticatorTool "MyP4ssw0rd"
Aug 27, 2019 12:05:08 PM java.io.ObjectInputStream filterCheck
INFO: ObjectInputFilter REJECTED: class com.rsa.cryptoj.o.nc, array length: -1, nRefs: 1, depth: 1, bytes: 72, ex: n/a
java.security.UnrecoverableKeyException: Rejected by the jceks.key.serialFilter or jdk.serialFilter property
        at com.sun.crypto.provider.KeyProtector.unseal(KeyProtector.java:352)
        at com.sun.crypto.provider.JceKeyStore.engineGetKey(JceKeyStore.java:136)
        at java.security.KeyStoreSpi.engineGetEntry(KeyStoreSpi.java:473)
        at java.security.KeyStore.getEntry(KeyStore.java:1521)
        at com.documentum.web.formext.session.TrustedAuthenticatorUtils.getSecretKey(Unknown Source)
        at com.documentum.web.formext.session.TrustedAuthenticatorUtils.encryptByDES(Unknown Source)
        at com.documentum.web.formext.session.TrustedAuthenticatorTool.main(TrustedAuthenticatorTool.java:63)
[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ # Adding com.rsa.cryptoj.o.nc to the allowed list
[weblogic@wls_01 work]$ java -Djceks.key.serialFilter='com.rsa.cryptoj.o.nc;java.lang.Enum;java.security.KeyRep;java.security.KeyRep$Type;javax.crypto.spec.SecretKeySpec;!*' -classpath "${enc_classpath}" com.documentum.web.formext.session.TrustedAuthenticatorTool "MyP4ssw0rd"
Aug 27, 2019 12:06:14 PM java.io.ObjectInputStream filterCheck
INFO: ObjectInputFilter REJECTED: class com.rsa.jcm.f.di, array length: -1, nRefs: 3, depth: 2, bytes: 141, ex: n/a
java.security.UnrecoverableKeyException: Rejected by the jceks.key.serialFilter or jdk.serialFilter property
        at com.sun.crypto.provider.KeyProtector.unseal(KeyProtector.java:352)
        at com.sun.crypto.provider.JceKeyStore.engineGetKey(JceKeyStore.java:136)
        at java.security.KeyStoreSpi.engineGetEntry(KeyStoreSpi.java:473)
        at java.security.KeyStore.getEntry(KeyStore.java:1521)
        at com.documentum.web.formext.session.TrustedAuthenticatorUtils.getSecretKey(Unknown Source)
        at com.documentum.web.formext.session.TrustedAuthenticatorUtils.encryptByDES(Unknown Source)
        at com.documentum.web.formext.session.TrustedAuthenticatorTool.main(TrustedAuthenticatorTool.java:63)
[weblogic@wls_01 work]$
[weblogic@wls_01 work]$ # Adding com.rsa.jcm.f.* + com.rsa.cryptoj.o.nc to the allowed list
[weblogic@wls_01 work]$ java -Djceks.key.serialFilter='com.rsa.jcm.f.*;com.rsa.cryptoj.o.nc;java.lang.Enum;java.security.KeyRep;java.security.KeyRep$Type;javax.crypto.spec.SecretKeySpec;!*' -classpath "${enc_classpath}" com.documentum.web.formext.session.TrustedAuthenticatorTool "MyP4ssw0rd"
Encrypted: [4Fc6kvmUc9cCSQXUqGkp+A==], Decrypted: [MyP4ssw0rd]
[weblogic@wls_01 work]$

 

So as you can see above, to encrypt passwords for WebTop 6.8 using JDK 8u171+, you will need to add both ‘com.rsa.cryptoj.o.nc‘ and ‘com.rsa.jcm.f.*‘ to the allowed list. The JCM entry uses a wildcard because several classes from this package are required.

The above was for the encryption of the passwords. That’s fine but obviously, when you deploy WebTop, it will need to decrypt these passwords at some point… So you will also need to set the same JVM parameter on your Application Server process (the Managed Server’s process in WebLogic):

-Djceks.key.serialFilter='com.rsa.jcm.f.*;com.rsa.cryptoj.o.nc;java.lang.Enum;java.security.KeyRep;java.security.KeyRep$Type;javax.crypto.spec.SecretKeySpec;!*'
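
In WebLogic, one common way to do that (a sketch only, assuming a standard 12c domain layout where the start scripts pick up $DOMAIN_HOME/bin/setUserOverrides.sh automatically) is the following:

# Hedged sketch of a $DOMAIN_HOME/bin/setUserOverrides.sh entry
# The backslash prevents the shell from expanding $Type when the script is sourced
JAVA_OPTIONS="${JAVA_OPTIONS} -Djceks.key.serialFilter=com.rsa.jcm.f.*;com.rsa.cryptoj.o.nc;java.lang.Enum;java.security.KeyRep;java.security.KeyRep\$Type;javax.crypto.spec.SecretKeySpec;!*"
export JAVA_OPTIONS

Alternatively, the same -D option can usually be added to the Managed Server’s start arguments (Server Start tab) when the servers are started through the Node Manager.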

 

You can change the order of the classes in the list; they just need to appear before the ‘!*‘ entry because everything after that is ignored.

 

This article Documentum – Encryption/Decryption of WebTop 6.8 passwords ‘REJECTED’ with recent JDK appeared first on Blog dbi services.

Documentum – Large documents in xPlore indexing


Documentum uses xPlore for the Full Text indexing/search processes. If you aren’t very familiar with how xPlore works, you might wonder how it is possible to index large documents, or you might be confused about some documents not being indexed (and therefore not searchable). In this blog, I will try to explain how xPlore can be configured to index these big documents without causing too much trouble because, by default, they are simply not indexed, which might be an issue. Documents tend to get bigger and bigger and therefore the default xPlore indexing thresholds might be a little bit outdated…

In this blog, I will go through all the thresholds that can be configured on the different components and I will try to explain a little bit what they are about. Before starting, a very short (and definitely not exhaustive) introduction on the xPlore indexing process is required. As soon as you install an IndexAgent, it triggers the creation of several things on the associated repository, including the registration of new events in the ‘dmi_registry‘. When working with documents, these events (‘dm_save‘, ‘dm_saveasnew‘, …) will generate new entries in ‘dmi_queue_item‘. The IndexAgent then reads ‘dmi_queue_item‘ and retrieves the documents that need indexing (add/update/remove from index). From there, a CPS is called to process the document (language identification, text extraction, tokenization, lemmatization, stemming, …). My point here is that there are two main sides to the indexing process: the IndexAgent and then the CPS. This is also true for the thresholds: you will need to configure them properly on both sides.
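
As a side note, the index queue can easily be checked with a couple of DQL queries. This is only a sketch assuming the default index queue user name (dm_fulltext_index_user); adapt it if you have several IndexAgents:

-- Hedged sketch: overview of the index queue and of the items that failed
SELECT task_state, count(*) FROM dmi_queue_item WHERE name='dm_fulltext_index_user' GROUP BY task_state;
SELECT item_id, date_sent, message FROM dmi_queue_item WHERE name='dm_fulltext_index_user' AND task_state='failed';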

 

I. IndexAgent

 
On the IndexAgent side, there isn’t much configuration strictly related to the size of documents since there is only one parameter, but it’s arguably the most important one: it’s the first barrier that will block your indexing if not configured properly.

In the file indexagent.xml (found under $JBOSS_HOME/server/DctmServer_Indexagent/deployments/IndexAgent.war/WEB-INF/classes), in the exporter section, you can find the parameter ‘contentSizeLimit‘. This parameter controls the maximum size of a document that can be sent to indexing. This is the real size of the document (‘content_size‘/’full_content_size‘); it is not the size of its text once extracted. The reason for that is simple: this limit is on the IndexAgent side and the text hasn’t been extracted yet, so the IndexAgent does not know how big the extracted text will be. If the size of the document exceeds the value defined for ‘contentSizeLimit‘, then the IndexAgent will not even try to process it, it will just reject it, and in this case you will see a message that the document exceeded the limit both in the IndexAgent logs and on the ‘dmi_queue_item‘ object. Other documents of the same batch aren’t impacted; the parameter ‘contentSizeLimit‘ applies to each and every document individually. The default value for this parameter is 20 000 000 bytes (19.07 MB).

If you are going to change this value, then you might need some other updates. You can tweak some other parameters if you are seeing issues while indexing large documents; all of them can be configured inside this indexagent.xml file. For example, you might want to look at ‘content_clean_interval‘ (in milliseconds), which controls when the export of the document (dftxml document) is removed from the staging area of the IndexAgent (location of ‘local_content_area‘). If the value is too small, then the CPS might try to retrieve a file to process it for indexing while the IndexAgent has already removed it. The default value for this parameter is 1 200 000 (20 minutes).
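
A quick way to review the current IndexAgent values (a hedged sketch; the path comes from the text above, adapt the JBoss server name to your installation):

IA_WAR=$JBOSS_HOME/server/DctmServer_Indexagent/deployments/IndexAgent.war
grep -B2 -A2 -iE "contentSizeLimit|content_clean_interval|local_content_area" ${IA_WAR}/WEB-INF/classes/indexagent.xml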

 

II. CPS

 
On the CPS side, you can look at several other size-related parameters. You can find these parameters (and many others) in two main locations. The first one is global to the Federation: indexserverconfig.xml (found under $XPLORE_HOME/config by default, but you can change it (E.g.: a shared location for a Multi-Node FT)). The second one is a CPS-specific configuration file: PrimaryDsearch_local_configuration.xml for a PrimaryDsearch or <CPS_Name>_configuration.xml for a CPS Only (found under $XPLORE_HOME/dsearch/cps/cps_daemon/).

The first parameter to look for is ‘max_text_threshold‘. This parameter controls the maximum text size of a document. This is the size of its text after extraction; it is not the real size of the document. If the text size of the document exceeds the value defined for ‘max_text_threshold‘, then the CPS will act according to the value defined for ‘cut_off_text‘. With ‘cut_off_text‘ set to true, the documents that exceed ‘max_text_threshold‘ will have their first ‘max_text_threshold‘ bytes of text indexed, but the CPS will stop once it reaches the limit. In this case the CPS log will contain something like ‘doc**** is partially processed’ and the dftxml of this document will contain the mention ‘partialIndexed‘. This means that the CPS stopped at the defined limit and therefore the index might be missing some content. With ‘cut_off_text‘ set to false (default value), the documents that exceed ‘max_text_threshold‘ will be rejected and therefore not full text indexed at all (only metadata is indexed). Other documents of the same batch aren’t impacted; the parameter ‘max_text_threshold‘ applies to each and every document individually. The default value for this parameter is 10 485 760 bytes (10 MB) and the maximum value possible is 2 147 483 648 (2 GB).

The second parameter to look for is ‘max_data_per_process‘. This parameter controls the maximum text size that a CPS batch should handle. The CPS is indexing documents/items in batches (‘CPS-requests-batch-size‘). By default, a CPS will process up to 5 documents per batch but, if I’m not mistaken, it can be less if there aren’t enough documents to process. If the total text size to be processed by the CPS for the complete batch is above ‘max_data_per_process‘, then the CPS will reject the full batch and it will therefore not full text index the content of any of these documents. This is going to be an issue if you increase the previous parameter but miss/forget this one. Indeed, you might end up with very small documents not being indexed because they were in a batch containing some big documents. To be sure that this parameter doesn’t block any batch, you can set it to ‘CPS-requests-batch-size‘*’max_text_threshold‘. The default value for this parameter is 31 457 280 bytes (30 MB) and the maximum value possible is 2 147 483 648 (2 GB).

As for the IndexAgent, if you are going to change these values, then you might need some other updates. There are a few timeout values like ‘request_time_out‘ (default 600 seconds), ‘text_extraction_time_out‘ (between 60 and 300 – default 300 seconds) or ‘linguistic_processing_time_out‘ (between 60 and 360 – default 360 seconds) that are likely to be exceeded if you are processing large documents, so you might need to tweak them as well.
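
Here as well, a quick hedged sketch to review the current CPS values (file names/paths from the text above; replace PrimaryDsearch with the name of your CPS instance on a CPS-only node):

grep -iE "max_text_threshold|max_data_per_process|cut_off_text|_time_out" $XPLORE_HOME/dsearch/cps/cps_daemon/PrimaryDsearch_local_configuration.xml
grep -iE "max_text_threshold|max_data_per_process" $XPLORE_HOME/config/indexserverconfig.xml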

 

III. Summary

 

Parameter            | Limit on                    | Short Description                           | Default Value               | Sample Value
contentSizeLimit     | IndexAgent (indexagent.xml) | Maximum size of the document                | 20 000 000 bytes (19.07 MB) | 104 857 600 bytes (100 MB)
max_text_threshold   | CPS (*_configuration.xml)   | Maximum text size of the document’s content | 10 485 760 bytes (10 MB)    | 41 943 040 bytes (40 MB)
max_data_per_process | CPS (*_configuration.xml)   | Maximum text size of the CPS batch          | 31 457 280 bytes (30 MB)    | 5*41 943 040 bytes = 209 715 200 (200 MB)

 
In summary, the first factor to consider is ‘contentSizeLimit‘ on the IndexAgent side. All documents with a size (document size) bigger than ‘contentSizeLimit‘ won’t be submitted to full text indexing; they will be skipped. The second factor is then either ‘max_text_threshold‘ or ‘max_data_per_process‘ or both, depending on the values you assigned to them. They both rely on the text size after extraction and they can both cause a document (or the batch) to be rejected from indexing.

Increasing the size thresholds is a somewhat complex exercise that needs careful thinking and alignment of numerous satellite parameters so that they can all work together without disrupting the performance or stability of the xPlore processes. These satellite parameters can be timeouts, cleanup, batch size, request size or even JVM size.

 

This article Documentum – Large documents in xPlore indexing appeared first on Blog dbi services.


Solr Sharding – Concepts & Methods


A few weeks ago, I published a series of blogs on Alfresco Clustering, including Solr Sharding. At that time, I planned to first explain what Solr Sharding really is and what the different concepts and methods around it are. Unfortunately, I didn’t get the time to write this blog, so I had to post the one related to Solr even before explaining the basics. Today, I’m here to right my wrong! Obviously, this blog focuses on Alfresco-related Solr Sharding since that’s what I do.

I. Solr Sharding – Concepts

The Sharding in general is the partitioning of a set of data in a specific way. There are several possibilities to do that, depending on the technology you are working on. In the scope of Solr, the Sharding is therefore the split of the Solr index into several smaller indices. You might be interested in the Solr Sharding because it improves the following points:

  • Fault Tolerance: with a single index, if you lose it, then… you lost it. If the index is split into several indices, then even if you are losing one part, you will still have all others that will continue working
  • High Availability: it provides more granularity than the single index. You might want for example to have a few small indices without HA and then have some others with HA because you configured them to contain some really important nodes of your repository
  • Automatic Failover: Alfresco knows automatically (with Dynamic Registration) which Shards are up-to-date and which ones are lagging behind, so it will automatically choose the best Shards to handle the search queries and you get the best results possible. In combination with the Fault Tolerance above, this gives the best possible HA solution with the fewest possible resources
  • Performance improvements: better indexing performance since you will have several Shards indexing the same repository, so each Shard has less work to do for example (depends on the Sharding Method). Better search performance since the search query will be processed by all Shards in parallel on smaller parts of the index instead of being one single query on the full index

Based on benchmarks, Alfresco considers that a Solr Shard can contain up to 50 to 80 million nodes. This is obviously not a hard limit, you can have a single Shard with 200 000 000 nodes, but it is more of a best practice if you want to keep a fast and reliable index. With older versions of Alfresco (before the version 5.1), you couldn’t create Shards because Alfresco didn’t support it. So, at that time, there was no other solution than having a single big index.

There is one additional thing that must be understood here: the 50 000 000 nodes soft limit is 50M nodes in the index, not in the repository. Let’s assume that you are using a DB_ID_RANGE method (see below for the explanation) with an assumed split of 65% live nodes, 20% archived nodes, 15% others (not indexed: renditions, other stores, …). So, if we are talking about the “workspace://SpacesStore” nodes (live ones), then if we want to fill a Shard with 50M nodes, we will have to use a DB_ID_RANGE of 100*50M/65 = 77M. Basically, the Shard should be more or less “full” once there are 77M IDs in the Database. For the “archive://SpacesStore” nodes (archived ones), it would be 100*50M/20 = 250M.

Alright so what are the main concepts in the Solr Sharding? There are several terms that need to be understood:

  • Node: It’s a Solr Server (a Solr installed using the Alfresco Search Services). Below, I will use “Solr Server” instead because I already use “nodes” (lowercase) for the Alfresco Documents, so using “Node” (uppercase) for the Solr Server might be a little bit confusing…
  • Cluster: It’s a set of Solr Servers all working together to index the same repository
  • Shard: A part of the index. In other words, it’s a representation (virtual concept) of the index composed of a certain set of nodes (Alfresco Documents)
  • Shard Instance: It’s one Instance of a specific Shard. A Shard is like a virtual concept while the Instance is the implementation of that virtual concept for that piece of the index. Several Shard Instances of the same Shard will therefore contain the same set of Alfresco nodes
  • Shard Group: It’s a collection of Shards (several indices) that forms a complete index. Shards are part of the same index (same Shard Group) if they:
    • Track the same store (E.g.: workspace://SpacesStore)
    • Use the same template (E.g.: rerank)
    • Have the same number of Shards max (“numShards“)
    • Use the same configuration (Sharding methods, Solr settings, …)

Shard is often (wrongly) used in place of Shard Instance, which might lead to some confusion… When you are reading “Shard”, sometimes it means the Shard itself (the virtual concept), sometimes it means all of its Shard Instances. This is what these concepts can look like:
Solr Sharding - Concepts

II. Solr Sharding – Methods

Alfresco supports several methods for the Solr Sharding and they all have different attributes and different ways of working:

  • MOD_ACL_ID (ACL v1): Alfresco nodes and ACLs are grouped by their ACL ID and stored together in the same Shard. Different ACL IDs will be assigned randomly to different Shards (depending on the number of Shards you defined). Each Alfresco node using a specific ACL ID will be stored in the Shard already containing this ACL ID. This simplifies the search requests from Solr since ACLs and nodes are together, so permission checking is simple. If you have a lot of documents using the same ACL, then the distribution will not be even between Shards. Parameters:
    • shard.method=MOD_ACL_ID
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>
  • ACL_ID (ACL v2): This is the same as MOD_ACL_ID, the only difference is that it changes the method used to assign the ACLs to the Shards so they are more evenly distributed. However, if you still have a lot of documents using the same ACL, then you still have the same issue. Parameters:
    • shard.method=ACL_ID
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>
  • DB_ID: This is the default Sharding Method in Solr 6 which will evenly distribute the nodes in the different Shards based on their DB ID (“alf_node.id“). The ACLs are replicated on each of the Shards so that Solr is able to perform the permission checking. If you have a lot of ACLs, then this will obviously make the Shards a little bit bigger, but this is usually insignificant. Parameters:
    • shard.method=DB_ID
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>
  • DB_ID_RANGE: Pretty much the same thing as the DB_ID but instead of looking into each DB ID one by one, it will just dispatch the DB IDs from the same range into the same Shard. The ranges are predefined at the Shard Instance creation and you cannot change them later, but this is also the only Sharding Method that allows you to add new Shards dynamically (auto-scaling) without the need to perform a full reindex. The lower value of the range is included and the upper value is excluded (for Math lovers: [begin-end[ ;)). Since DB IDs are incremental (increase over time), performing a search query with a date filter might end-up as simple as checking inside a single Shard. Parameters:
    • shard.method=DB_ID_RANGE
    • shard.range=<begin-end>
    • shard.instance=<shard.instance>
  • DATE: Months will be assigned to a specific Shard sequentially and then nodes are indexed into the Shard that was assigned the current month. Therefore, if you have 2 Shards, each one will contain 6 months (Shard 1 = Months 1,3,5,7,9,11 // Shard 2 = Months 2,4,6,8,10,12). It is possible to assign consecutive months to the same Shard using the “shard.date.grouping” parameter which defines how many months should be grouped together (a semester for example). If there is no date on a node, the fallback method is to use DB_ID instead. Parameters:
    • shard.method=DATE
    • shard.key=exif:dateTimeOriginal
    • shard.date.grouping=<1-12>
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>
  • PROPERTY: A property is specified as the base for the Shard assignment. The first time that a node is indexed with a new value for this property, the node will be assigned randomly to a Shard. Each node coming in with the same value for this property will be assigned to the same Shard. Valid properties are either d:text (single line text), d:date (date only) or d:datetime (date+time). It is possible to use only a part of the property’s value using “shard.regex” (To keep only the first 4 digits of a date for example: shard.regex=^\d{4}). If this property doesn’t exist on a node or if the regex doesn’t match (if any is specified), the fallback method is to use DB_ID instead. Parameters:
    • shard.method=PROPERTY
    • shard.key=cm:creator
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>
  • EXPLICIT_ID: Pretty much similar to the PROPERTY but instead of using the value of a “random” property, this method requires a specific property (d:text) to define explicitly on which Shard the node should be indexed. Therefore, this will require an update of the Data Model to have one property dedicated to the assignment of a node to a Shard. In case you are using several types of documents, then you will potentially want to do that for all. If this property doesn’t exist on a node or if an invalid Shard number is given, the fallback method is to use DB_ID instead. Parameters:
    • shard.method=EXPLICIT_ID
    • shard.key=<property> (E.g.: cm:targetShardInstance)
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>

As you can see above, each Sharding Method has its own set of properties. You can define these properties in:

  • The template’s solrcore.properties file in which case it will apply to all Shard Instance creations
    • E.g.: $SOLR_HOME/solrhome/templates/rerank/conf/solrcore.properties
  • The URL/Command used to create the Shard Instance in which case it will only apply to the current Shard Instance creation
    • E.g.: curl -v “http://host:port/solr/admin/cores?action=newCore&…&property.shard.method=DB_ID_RANGE&property.shard.range=0-50000000&property.shard.instance=0
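
As an illustration, here is a hedged sketch of what the template-level configuration could look like (property names from the list above; the values are examples only):

# Hedged sketch: extract of $SOLR_HOME/solrhome/templates/rerank/conf/solrcore.properties
# Every Shard Instance created from this template inherits these values;
# shard.instance (and shard.range for DB_ID_RANGE) is usually passed on the creation URL instead
shard.method=DB_ID
shard.count=2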

Summary of the benefits of each method:
Solr Sharding - Benefits

First supported versions for the Solr Sharding in Alfresco:
Solr Sharding - Availability

Hopefully, this is a good first look into the Solr Sharding. In a future blog, I will talk about the creation process and show some examples of what is possible. If you want to read more on the subject, don’t hesitate to take a look at the Alfresco documentation; it doesn’t explain everything, but it is still a very good starting point.

This article Solr Sharding – Concepts & Methods appeared first on Blog dbi services.

Documentum – Database password validation rule in 16.4


A few months ago, I started working with the CS 16.4 (always using silent installation) and I had the pleasant surprise to see a new error message in the installation log. It’s always such a pleasure to lose time on pretty stupid things like the one I will talk about in this blog.

So what’s the issue? Well, upon installing a new repository, I saw an error message around the start of the silent installation. In the end, the process didn’t stop and the repository was actually installed and functional – as far as I could see – but I needed to check this deeper, to be sure that there was no problem. This is an extract of the installation log showing the exact error message:

[dmadmin@documentum-server-0 ~]$ cd $DM_HOME/install/logs
[dmadmin@documentum-server-0 logs]$ cat install.log
14:45:02,608  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product name is: UniversalServerConfigurator
14:45:02,608  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product version is: 16.4.0000.0248
14:45:02,608  INFO [main]  -
14:45:02,660  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - Done InitializeSharedLibrary ...
14:45:02,698  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerInformation - Setting CONFIGURE_DOCBROKER value to TRUE for SERVER
14:45:02,699  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerInformation - Setting CONFIGURE_DOCBASE value to TRUE for SERVER
14:45:03,701  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerCheckEnvrionmentVariable - The installer was started using the dm_launch_server_config_program.sh script.
14:45:03,701  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerCheckEnvrionmentVariable - The installer will determine the value of environment variable DOCUMENTUM.
14:45:06,702  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerCheckEnvrionmentVariable - The installer will determine the value of environment variable PATH.
14:45:09,709 ERROR [main] com.documentum.install.server.installanywhere.actions.DiWAServerValidteVariables - Invalid database user password. Valid database user password rules are:
1. Must contain only ASCII alphanumeric characters,'.', '_' and '-'.
Please enter a valid database user password.
14:45:09,717  INFO [main]  - The license file:/app/dctm/server/dba/tcs_license exists.
14:45:09,721  INFO [main] com.documentum.install.server.installanywhere.actions.DiWASilentConfigurationInstallationValidation - Start to validate docbase parameters.
14:45:09,723  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerPatchExistingDocbaseAction - The installer will obtain all the DOCBASE on the machine.
14:45:11,742  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerDocAppFolder - The installer will obtain all the DocApps which could be installed for the repository.
...
[dmadmin@documentum-server-0 logs]$

 

As you can see above, the error message is self-explanatory: the Database password used doesn’t comply with the so-called “rules”. Seeing this kind of message, you would expect the installer to stop since the password doesn’t comply; it shouldn’t install the Repository. Yet it just skips it and completes without problem.

On my side, I have always been using the same rule for passwords in Documentum: at least 1 lowercase, 1 uppercase, 1 digit, 1 special character and a total of 15 or more characters. Just comparing the password that was used for the Database with what is printed in the log, the only reason why the password wouldn’t be correct is that I put a ‘+’ in it. In previous versions of Documentum, I often used a ‘+’ and I never had any issues or errors with it.

So I checked with the OpenText Support (#4240691) to get more details on what is happening here. It turns out that starting with the CS 16.4, OpenText added a new password validation for the Database account and that this password must indeed only contain alphanumeric characters, ‘.’, ‘_’ or ‘-‘… So they added a password validation which complains but doesn’t do anything. Actually, it’s even worse: the CS Team added this password validation with the CS 16.4 and enforced the rule, but only for the GUI installer. The same check was added only later to the silent installation and was not enforced at that time. That’s the reason why, if you try using the same password in the GUI, it should fail, while with the silent installation, it prints an error but still completes successfully… Therefore, with the same binaries, you have two different behaviors. That’s pretty cool, right? Right? RIGHT?

In the end, a new defect (#CS-121161) has been raised and it seems they will enforce the rule in a coming patch. Therefore, if you are planning to use ‘+’ characters in your Database passwords, consider changing them upfront to avoid a failure in the Repository installation. Looks like this time I should have stayed quiet and maybe I would have been able to use ‘+’ for the next 10 years with the silent installations… Sorry!

 

This article Documentum – Database password validation rule in 16.4 appeared first on Blog dbi services.

Documentum – Usage of K8s Services to install Documentum?


In the past several months, we have been working extensively on setting up a CI/CD pipeline for Documentum at one of our customers. As part of this project, we are using Kubernetes pods for the Documentum components. In this blog, I will talk about an issue caused by what seemed like a good idea but turned out, in the end, not to be such a good one…

The goal of this project is to migrate dozens of Documentum environments and several hundred VMs into K8s pods. In order to streamline the migration and simplify the management, we thought: why not try to use K8s Services (ingress) for all the communications between the pods as well as with the outside of K8s? Indeed, we needed to take into account several interfaces outside of the K8s world, usually some old software that would most probably never support containerization. These interfaces will need to continue to work the way they used to, so we will need K8s Services at some point for the communications between Documentum and these external interfaces. Therefore, the idea was to try to use these exact same K8s Services to install the Documentum components.

By default, K8s gives each pod a headless-service DNS name composed in the following way: <pod_name>.<service_name>.<namespace_name>.<cluster>. The goal here was therefore to define an additional K8s Service for each Content Server: <service_name_ext>.<namespace_name>.<cluster>. This is what has been used:

  • Primary Content Server:
    • headless/pod: documentum-server-0.documentum-server.dbi-ns01.svc.cluster.local
    • K8s Service: cs01.dbi-ns01.svc.cluster.local
  • Remote Content Server:
    • headless/pod: documentum-server-1.documentum-server.dbi-ns01.svc.cluster.local
    • K8s Service: cs02.dbi-ns01.svc.cluster.local
  • Repository & Service: gr_repo

On a typical VM, you would usually install Documentum using the VM hostname. The equivalent on K8s would therefore be to use the headless/pod name. Alternatively, on a VM, you could think about using a DNS alias to install Documentum, and you might think that the same should work here. I sure did, and therefore we tried the same kind of thing on K8s with the K8s Services directly.
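
For reference, this is the kind of per-pod K8s Service we are talking about. This is only a hedged sketch: the selector label assumes the pods are managed by a StatefulSet and the ports are examples (only the JMS port 9080 is referenced later in this post), so adapt both to your actual setup:

# Hedged sketch: a per-pod Service giving documentum-server-0 the stable name cs01.dbi-ns01.svc.cluster.local
cat <<'EOF' | kubectl apply -n dbi-ns01 -f -
apiVersion: v1
kind: Service
metadata:
  name: cs01
spec:
  selector:
    statefulset.kubernetes.io/pod-name: documentum-server-0
  ports:
    - name: docbroker
      port: 1489
    - name: docbase
      port: 50000
    - name: jms
      port: 9080
EOF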

Doing so for the Primary Content Server, all the Documentum silent installers completed successfully. We used “cs01.dbi-ns01.svc.cluster.local” for the following things for example:

  • Docbroker projections
  • Repository installation
  • DFC & CS Projections
  • BPM/xCP installation

Therefore, looking into the silent properties file for the Repository for example, it contained the following:

[dmadmin@documentum-server-0 ~]$ grep -E "FQDN|HOST" CS_Docbase_Global.properties
SERVER.FQDN=cs01.dbi-ns01.svc.cluster.local
SERVER.PROJECTED_DOCBROKER_HOST=cs01.dbi-ns01.svc.cluster.local
[dmadmin@documentum-server-0 ~]$

 

At the end of our silent installation (which includes the Documentum silent installers + dbi services’ best practices (other things like security, JMS configuration, projections, jobs, aso…)), connection to the repository was possible and D2 & DA were both working properly, so it looked like a good first step. Unfortunately, when I did a review of the repository objects later, I saw some wrong objects and a bit of a mess in the repository: that’s the full purpose of this blog, to explain what went wrong when using a K8s Service instead of the headless/pod name.

After a quick review, I found the following things that were wrong/messy:

  • dm_jms_config object
    • Expected: for a Primary Content Server, you should have one JMS config object with “do_mail”, “do_method” and “SAMLAuthentication” at least (+ “do_bpm” for BPM/xCP, Indexagent ones, aso…)
      • JMS <FQDN>:9080 for gr_repo.gr_repo
    • Actual: the installer created two JMS Objects, one with a correct name (using FQDN provided in installer = K8s Service), one with a wrong name (using pod name (short-name, no domain))
      • JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo => Correct one and it contained all the needed servlets (“do_mail”, “do_method”, “do_bpm” and “SAMLAuthentication”)
      • JMS documentum-server-0:9080 for gr_repo.gr_repo => Wrong one and it contained all the do_ servlets but not the SAML one strangely (“do_mail”, “do_method” and “do_bpm” only, not “SAMLAuthentication”)
  • dm_acs_config object
    • Expected: just like for the JMS, you would expect the object to be created with the FQDN you gave it in the installer
      • <FQDN>ACS1
    • Actual: the installer created the ACS config object using the headless/pod name (the full name this time, not the short name)
      • documentum-server-0.documentum-server.dbi-ns01.svc.cluster.localACS1
  • A lot of other references to the headless/pod name: dm_user, dm_job, dm_client_registration, dm_client_rights, aso…
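
To spot this kind of leftover quickly, a few simple DQL checks can help (a sketch only; run them with idql as the installation owner):

-- Hedged sketch: list the JMS/ACS config objects and the job targets to find references to the pod name
SELECT r_object_id, object_name FROM dm_jms_config;
SELECT r_object_id, object_name FROM dm_acs_config;
SELECT DISTINCT target_server FROM dm_job;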

So in short, sometimes the Repository installer uses the FQDN provided (K8s Service) and sometimes it doesn’t. So what’s the point in providing a FQDN during the installation since it will ignore it anyway for 90% of the objects? In addition, it also creates two JMS config objects at the same time but with different names and different servlets. Looking at the “dm_jms_config_setup.out” log file created by the installer when it executed the JMS config object creation, you can see that it mentions the creation of only one object and yet, at the end, it says that there are two:

[dmadmin@documentum-server-0 ~]$ cat $DOCUMENTUM/dba/config/gr_repo/dm_jms_config_setup.out
/app/dctm/server/product/16.4/bin/dm_jms_admin.sh -docbase gr_repo.gr_repo -username dmadmin -action add,enableDFC,testDFC,migrate,dumpServerCache,listAll -jms_host_name cs01.dbi-ns01.svc.cluster.local -jms_port 9080 -jms_proximity 1 -webapps ServerApps -server_config_id 3d0f123450000102
2019-10-21 09:50:55 UTC:  Input arguments are: -docbase gr_repo.gr_repo -username dmadmin -action add,enableDFC,testDFC,migrate,dumpServerCache,listAll -jms_host_name cs01.dbi-ns01.svc.cluster.local -jms_port 9080 -jms_proximity 1 -webapps ServerApps -server_config_id 3d0f123450000102
2019-10-21 09:50:55 UTC:  Input parameters are: {jms_port=[9080], server_config_id=[3d0f123450000102], docbase=[gr_repo.gr_repo], webapps=[ServerApps], action=[add,enableDFC,testDFC,migrate,dumpServerCache,listAll], jms_proximity=[1], jms_host_name=[cs01.dbi-ns01.svc.cluster.local], username=[dmadmin]}
2019-10-21 09:50:55 UTC:  ======================================================================================
2019-10-21 09:50:55 UTC:  Begin administering JMS config objects in docbase gr_repo.gr_repo ...
2019-10-21 09:51:01 UTC:  The following JMS config object has been successfully created/updated in docbase gr_repo
2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:                      JMS Config Name: JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo
                      JMS Config ID: 080f1234500010a3
                      JMS Host Name: cs01.dbi-ns01.svc.cluster.local
                    JMS Port Number: 9080
             Is Disabled In Docbase: F
               Repeating attributes:
               Content_Server_Id[0] = 3d0f123450000102
        Content_Server_Host_Name[0] = documentum-server-0
    JMS_Proximity_Relative_to_CS[0] = 2
             Servlet to URI Mapping:
                          do_method = http://cs01.dbi-ns01.svc.cluster.local:9080/DmMethods/servlet/DoMethod
                 SAMLAuthentication = http://cs01.dbi-ns01.svc.cluster.local:9080/SAMLAuthentication/servlet/ValidateSAMLResponse
                            do_mail = http://cs01.dbi-ns01.svc.cluster.local:9080/DmMail/servlet/DoMail

2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:  Successfully enabled principal_auth_priv for current DFC client  in docbase gr_repo
2019-10-21 09:51:01 UTC:  Successfully tested principal_auth_priv for current DFC client  in docbase gr_repo
2019-10-21 09:51:01 UTC:  Successfully migrated content server 3d0f123450000102 to use JMS config object(s)
2019-10-21 09:51:01 UTC:  Dump of JMS Config List in content server cache, content server is gr_repo
2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:  USER ATTRIBUTES

  jms_list_last_refreshed         : Mon Oct 21 09:51:01 2019
  incr_wait_time_on_failure       : 30
  max_wait_time_on_failure        : 3600
  current_jms_index               : -1
  jms_config_id                [0]: 080f1234500010a3
                               [1]: 080f1234500010a4
  jms_config_name              [0]: JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo
                               [1]: JMS documentum-server-0:9080 for gr_repo.gr_repo
  server_config_id             [0]: 3d0f123450000102
                               [1]: 3d0f123450000102
  server_config_name           [0]: gr_repo
                               [1]: gr_repo
  jms_to_cs_proximity          [0]: 2
                               [1]: 1
  is_disabled_in_docbase       [0]: F
                               [1]: F
  is_marked_dead_in_cache      [0]: F
                               [1]: F
  intended_purpose             [0]: DM_JMS_PURPOSE_FOR_LOAD_BALANCING
                               [1]: DM_JMS_PURPOSE_DEFAULT_EMBEDDED_JMS
  last_failure_time            [0]: N/A
                               [1]: N/A
  next_retry_time              [0]: N/A
                               [1]: N/A
  failure_count                [0]: 0
                               [1]: 0

SYSTEM ATTRIBUTES


APPLICATION ATTRIBUTES


INTERNAL ATTRIBUTES


2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:  Total 2 JMS Config objects found in docbase gr_repo
2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:                      JMS Config Name: JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo
                      JMS Config ID: 080f1234500010a3
                      JMS Host Name: cs01.dbi-ns01.svc.cluster.local
                    JMS Port Number: 9080
             Is Disabled In Docbase: F
               Repeating attributes:
               Content_Server_Id[0] = 3d0f123450000102
        Content_Server_Host_Name[0] = documentum-server-0
    JMS_Proximity_Relative_to_CS[0] = 2
             Servlet to URI Mapping:
                          do_method = http://cs01.dbi-ns01.svc.cluster.local:9080/DmMethods/servlet/DoMethod
                 SAMLAuthentication = http://cs01.dbi-ns01.svc.cluster.local:9080/SAMLAuthentication/servlet/ValidateSAMLResponse
                            do_mail = http://cs01.dbi-ns01.svc.cluster.local:9080/DmMail/servlet/DoMail

2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:                      JMS Config Name: JMS documentum-server-0:9080 for gr_repo.gr_repo
                      JMS Config ID: 080f1234500010a4
                      JMS Host Name: documentum-server-0
                    JMS Port Number: 9080
             Is Disabled In Docbase: F
               Repeating attributes:
               Content_Server_Id[0] = 3d0f123450000102
        Content_Server_Host_Name[0] = documentum-server-0
    JMS_Proximity_Relative_to_CS[0] = 1
             Servlet to URI Mapping:
                          do_method = http://documentum-server-0:9080/DmMethods/servlet/DoMethod
                            do_mail = http://documentum-server-0:9080/DmMail/servlet/DoMail

2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:  Done administering JMS config objects in docbase gr_repo.gr_repo: status=SUCCESS ...
2019-10-21 09:51:01 UTC:  ======================================================================================
Program exit status = 0 = SUCCESS
Connect to docbase gr_repo.gr_repo as user dmadmin.
Start running dm_jms_config_setup.ebs script on docbase gr_repo.gr_repo
[DM_API_E_NO_MATCH]error:  "There was no match in the docbase for the qualification: dm_method where object_name='dm_JMSAdminConsole'"


dm_method dm_JMSAdminConsole object does not exist, yet.
jarFile = /app/dctm/server/product/16.4/lib/dmjmsadmin.jar
wrapper_script = /app/dctm/server/product/16.4/bin/dm_jms_admin.sh
Create dm_method dm_JMSAdminConsole object in docbase now
new dm_JMSAdminConsole dm_method object created in docbase successfully
new object id is: 100f123450001098
Begin updating JMS_LOCATION for Java Methods ...
Assign JMS_LOCATION=ANY to a_extended_properties in method object CTSAdminMethod
Assign JMS_LOCATION=ANY to a_extended_properties in method object dm_bp_transition_java
Assign JMS_LOCATION=ANY to a_extended_properties in method object dm_bp_schedule_java
Assign JMS_LOCATION=ANY to a_extended_properties in method object dm_bp_batch_java
Assign JMS_LOCATION=ANY to a_extended_properties in method object dm_bp_validate_java
Assign JMS_LOCATION=ANY to a_extended_properties in method object dm_event_template_sender
Done updating JMS_LOCATION for Java Methods ...
Begin create default JMS config object for content server
Content Server version: 16.4.0110.0167  Linux64.Oracle
Content Server ID: 3d0f123450000102
dm_jms_config type id = 030f12345000017c
jms_count = 0
wrapper_script = /app/dctm/server/product/16.4/bin/dm_jms_admin.sh
script_params =  -docbase gr_repo.gr_repo -username dmadmin -action add,enableDFC,testDFC,migrate,dumpServerCache,listAll  -jms_host_name cs01.dbi-ns01.svc.cluster.local -jms_port 9080 -jms_proximity 1 -webapps ServerApps  -server_config_id 3d0f123450000102
cmd = /app/dctm/server/product/16.4/bin/dm_jms_admin.sh  -docbase gr_repo.gr_repo -username dmadmin -action add,enableDFC,testDFC,migrate,dumpServerCache,listAll  -jms_host_name cs01.dbi-ns01.svc.cluster.local -jms_port 9080 -jms_proximity 1 -webapps ServerApps  -server_config_id 3d0f123450000102
status = 0
Finished creating default JMS config object for content server
Finished running dm_jms_config_setup.ebs...
Disconnect from the docbase.
[dmadmin@documentum-server-0 ~]$

 

In the log file above, there is no mention of “do_bpm” because this is the installation of the Repository and therefore, at that time, BPM/xCP isn’t installed yet. We only install it later, switch the URLs to HTTPS and do a few other things. So looking at the objects in the Repository, this is what we can see at the end of all installations (I purposely only executed the HTTP->HTTPS switch and the BPM/xCP addition but not the JMS Projections, so that the default values added by the installer, which are also wrong, are still visible below):

[dmadmin@documentum-server-0 ~]$ iapi gr_repo
Please enter a user (dmadmin):
Please enter password for dmadmin:


        OpenText Documentum iapi - Interactive API interface
        Copyright (c) 2018. OpenText Corporation
        All rights reserved.
        Client Library Release 16.4.0110.0058


Connecting to Server using docbase gr_repo
[DM_SESSION_I_SESSION_START]info:  "Session 010f12345000117c started for user dmadmin."


Connected to OpenText Documentum Server running Release 16.4.0110.0167  Linux64.Oracle
Session id is s0
API> ?,c,select count(*) from dm_server_config;
count(*)
----------------------
                     1
(1 row affected)

API> ?,c,select r_object_id, object_name, app_server_name, app_server_uri from dm_server_config order by object_name, app_server_name;
r_object_id       object_name  app_server_name  app_server_uri
----------------  -----------  ---------------  -----------------------------------------------------------------------
3d0f123450000102  gr_repo      do_bpm           https://cs01.dbi-ns01.svc.cluster.local:9082/bpm/servlet/DoMethod
                               do_mail          https://cs01.dbi-ns01.svc.cluster.local:9082/DmMail/servlet/DoMail
                               do_method        https://cs01.dbi-ns01.svc.cluster.local:9082/DmMethods/servlet/DoMethod
(1 row affected)

API> ?,c,select count(*) from dm_jms_config;
count(*)
----------------------
                     2
(1 row affected)

API> ?,c,select r_object_id, object_name from dm_jms_config order by object_name;
r_object_id       object_name
----------------  ------------------------------------------------------------
080f1234500010a3  JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo
080f1234500010a4  JMS documentum-server-0:9080 for gr_repo.gr_repo
(2 rows affected)

API> dump,c,080f1234500010a3
...
USER ATTRIBUTES

  object_name                     : JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo
  title                           :
  subject                         :
  authors                       []: <none>
  keywords                      []: <none>
  resolution_label                :
  owner_name                      : dmadmin
  owner_permit                    : 7
  group_name                      : docu
  group_permit                    : 5
  world_permit                    : 3
  log_entry                       :
  acl_domain                      : dmadmin
  acl_name                        : dm_450f123450000101
  language_code                   :
  server_config_id             [0]: 3d0f123450000102
  config_type                     : 2
  servlet_name                 [0]: do_method
                               [1]: SAMLAuthentication
                               [2]: do_mail
                               [3]: do_bpm
  base_uri                     [0]: https://cs01.dbi-ns01.svc.cluster.local:9082/DmMethods/servlet/DoMethod
                               [1]: https://cs01.dbi-ns01.svc.cluster.local:9082/SAMLAuthentication/servlet/ValidateSAMLResponse
                               [2]: https://cs01.dbi-ns01.svc.cluster.local:9082/DmMail/servlet/DoMail
                               [3]: https://cs01.dbi-ns01.svc.cluster.local:9082/bpm/servlet/DoMethod
  supported_protocol           [0]: https
                               [1]: https
                               [2]: https
                               [3]: https
  projection_netloc_enable      []: <none>
  projection_netloc_ident       []: <none>
  projection_enable            [0]: T
  projection_proximity_value   [0]: 2
  projection_targets           [0]: documentum-server-0
  projection_ports             [0]: 0
  network_locations             []: <none>
  server_major_version            :
  server_minor_version            :
  is_disabled                     : F

SYSTEM ATTRIBUTES

  r_object_type                   : dm_jms_config
  r_creation_date                 : 10/21/2019 09:51:00
  r_modify_date                   : 10/21/2019 10:49:08
  r_modifier                      : dmadmin
  r_access_date                   : nulldate
  r_composite_id                []: <none>
  r_composite_label             []: <none>
  r_component_label             []: <none>
  r_order_no                    []: <none>
  r_link_cnt                      : 0
  r_link_high_cnt                 : 0
  r_assembled_from_id             : 0000000000000000
  r_frzn_assembly_cnt             : 0
  r_has_frzn_assembly             : F
  r_is_virtual_doc                : 0
  r_page_cnt                      : 0
  r_content_size                  : 0
  r_lock_owner                    :
  r_lock_date                     : nulldate
  r_lock_machine                  :
  r_version_label              [0]: 1.0
                               [1]: CURRENT
  r_immutable_flag                : F
  r_frozen_flag                   : F
  r_has_events                    : F
  r_creator_name                  : dmadmin
  r_is_public                     : T
  r_policy_id                     : 0000000000000000
  r_resume_state                  : 0
  r_current_state                 : 0
  r_alias_set_id                  : 0000000000000000
  r_full_content_size             : 0
  r_aspect_name                 []: <none>
  r_object_id                     : 080f1234500010a3

APPLICATION ATTRIBUTES

  a_application_type              :
  a_status                        :
  a_is_hidden                     : F
  a_retention_date                : nulldate
  a_archive                       : F
  a_compound_architecture         :
  a_link_resolved                 : F
  a_content_type                  :
  a_full_text                     : T
  a_storage_type                  :
  a_special_app                   :
  a_effective_date              []: <none>
  a_expiration_date             []: <none>
  a_publish_formats             []: <none>
  a_effective_label             []: <none>
  a_effective_flag              []: <none>
  a_category                      :
  a_is_template                   : F
  a_controlling_app               :
  a_extended_properties         []: <none>
  a_is_signed                     : F
  a_last_review_date              : nulldate

INTERNAL ATTRIBUTES

  i_is_deleted                    : F
  i_reference_cnt                 : 1
  i_has_folder                    : T
  i_folder_id                  [0]: 0c0f123450000105
  i_contents_id                   : 0000000000000000
  i_cabinet_id                    : 0c0f123450000105
  i_antecedent_id                 : 0000000000000000
  i_chronicle_id                  : 080f1234500010a3
  i_latest_flag                   : T
  i_branch_cnt                    : 0
  i_direct_dsc                    : F
  i_is_reference                  : F
  i_retain_until                  : nulldate
  i_retainer_id                 []: <none>
  i_partition                     : 0
  i_is_replica                    : F
  i_vstamp                        : 4

API> dump,c,080f1234500010a4
...
USER ATTRIBUTES

  object_name                     : JMS documentum-server-0:9080 for gr_repo.gr_repo
  title                           :
  subject                         :
  authors                       []: <none>
  keywords                      []: <none>
  resolution_label                :
  owner_name                      : dmadmin
  owner_permit                    : 7
  group_name                      : docu
  group_permit                    : 5
  world_permit                    : 3
  log_entry                       :
  acl_domain                      : dmadmin
  acl_name                        : dm_450f123450000101
  language_code                   :
  server_config_id             [0]: 3d0f123450000102
  config_type                     : 2
  servlet_name                 [0]: do_method
                               [1]: do_mail
                               [2]: do_bpm
  base_uri                     [0]: https://documentum-server-0:9082/DmMethods/servlet/DoMethod
                               [1]: https://documentum-server-0:9082/DmMail/servlet/DoMail
                               [2]: https://cs01.dbi-ns01.svc.cluster.local:9082/bpm/servlet/DoMethod
  supported_protocol           [0]: https
                               [1]: https
                               [2]: https
  projection_netloc_enable      []: <none>
  projection_netloc_ident       []: <none>
  projection_enable            [0]: T
  projection_proximity_value   [0]: 1
  projection_targets           [0]: documentum-server-0
  projection_ports             [0]: 0
  network_locations             []: <none>
  server_major_version            :
  server_minor_version            :
  is_disabled                     : F

SYSTEM ATTRIBUTES

  r_object_type                   : dm_jms_config
  r_creation_date                 : 10/21/2019 09:51:01
  r_modify_date                   : 10/21/2019 10:50:20
  r_modifier                      : dmadmin
  r_access_date                   : nulldate
  r_composite_id                []: <none>
  r_composite_label             []: <none>
  r_component_label             []: <none>
  r_order_no                    []: <none>
  r_link_cnt                      : 0
  r_link_high_cnt                 : 0
  r_assembled_from_id             : 0000000000000000
  r_frzn_assembly_cnt             : 0
  r_has_frzn_assembly             : F
  r_is_virtual_doc                : 0
  r_page_cnt                      : 0
  r_content_size                  : 0
  r_lock_owner                    :
  r_lock_date                     : nulldate
  r_lock_machine                  :
  r_version_label              [0]: 1.0
                               [1]: CURRENT
  r_immutable_flag                : F
  r_frozen_flag                   : F
  r_has_events                    : F
  r_creator_name                  : dmadmin
  r_is_public                     : T
  r_policy_id                     : 0000000000000000
  r_resume_state                  : 0
  r_current_state                 : 0
  r_alias_set_id                  : 0000000000000000
  r_full_content_size             : 0
  r_aspect_name                 []: <none>
  r_object_id                     : 080f1234500010a4

APPLICATION ATTRIBUTES

  a_application_type              :
  a_status                        :
  a_is_hidden                     : F
  a_retention_date                : nulldate
  a_archive                       : F
  a_compound_architecture         :
  a_link_resolved                 : F
  a_content_type                  :
  a_full_text                     : T
  a_storage_type                  :
  a_special_app                   :
  a_effective_date              []: <none>
  a_expiration_date             []: <none>
  a_publish_formats             []: <none>
  a_effective_label             []: <none>
  a_effective_flag              []: <none>
  a_category                      :
  a_is_template                   : F
  a_controlling_app               :
  a_extended_properties         []: <none>
  a_is_signed                     : F
  a_last_review_date              : nulldate

INTERNAL ATTRIBUTES

  i_is_deleted                    : F
  i_reference_cnt                 : 1
  i_has_folder                    : T
  i_folder_id                  [0]: 0c0f123450000105
  i_contents_id                   : 0000000000000000
  i_cabinet_id                    : 0c0f123450000105
  i_antecedent_id                 : 0000000000000000
  i_chronicle_id                  : 080f1234500010a4
  i_latest_flag                   : T
  i_branch_cnt                    : 0
  i_direct_dsc                    : F
  i_is_reference                  : F
  i_retain_until                  : nulldate
  i_retainer_id                 []: <none>
  i_partition                     : 0
  i_is_replica                    : F
  i_vstamp                        : 2

API> ?,c,select r_object_id, object_name, servlet_name, supported_protocol, base_uri from dm_jms_config order by object_name, servlet_name;
r_object_id       object_name                                                   servlet_name        supported_protocol  base_uri                                                                                                                  
----------------  ------------------------------------------------------------  ------------------  ------------------  --------------------------------------------------------------------------------------------
080f1234500010a3  JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo  SAMLAuthentication  https               https://cs01.dbi-ns01.svc.cluster.local:9082/SAMLAuthentication/servlet/ValidateSAMLResponse
                                                                                do_bpm              https               https://cs01.dbi-ns01.svc.cluster.local:9082/bpm/servlet/DoMethod
                                                                                do_mail             https               https://cs01.dbi-ns01.svc.cluster.local:9082/DmMail/servlet/DoMail
                                                                                do_method           https               https://cs01.dbi-ns01.svc.cluster.local:9082/DmMethods/servlet/DoMethod
080f1234500010a4  JMS documentum-server-0:9080 for gr_repo.gr_repo              do_bpm              https               https://cs01.dbi-ns01.svc.cluster.local:9082/bpm/servlet/DoMethod
                                                                                do_mail             https               https://documentum-server-0:9082/DmMail/servlet/DoMail
                                                                                do_method           https               https://documentum-server-0:9082/DmMethods/servlet/DoMethod
(2 rows affected)

API> ?,c,select r_object_id, object_name, projection_enable, projection_proximity_value, projection_ports, projection_targets from dm_jms_config order by object_name, projection_targets;
r_object_id       object_name                                                   projection_enable  projection_proximity_value  projection_ports  projection_targets
----------------  ------------------------------------------------------------  -----------------  --------------------------  ----------------  -------------------
080f1234500010a3  JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo                  1                           2                 0  documentum-server-0
080f1234500010a4  JMS documentum-server-0:9080 for gr_repo.gr_repo                              1                           1                 0  documentum-server-0
(2 rows affected)

API> ?,c,select count(*) from dm_acs_config;
count(*)
----------------------
                     1
(1 row affected)

API> ?,c,select r_object_id, object_name, acs_supported_protocol, acs_base_url from dm_acs_config order by object_name, acs_base_url;
r_object_id       object_name                                                           acs_supported_protocol  acs_base_url
----------------  --------------------------------------------------------------------  ----------------------  ------------------------------------------------------------
080f123450000490  documentum-server-0.documentum-server.dbi-ns01.svc.cluster.localACS1  https                   https://cs01.dbi-ns01.svc.cluster.local:9082/ACS/servlet/ACS
(1 row affected)

API> exit
Bye
[dmadmin@documentum-server-0 ~]$

 

So what to do with that? Well, a simple solution is to just remove the wrong JMS config object (the second one) and redo the JMS Projections. You can live with the wrong name of the ACS config object and the other wrong references: even if it’s ugly, it will work properly; it’s really just the second JMS config object that might cause you some trouble. You can either script all that so it’s done properly in the end or do it manually, but obviously when you have a project with a few hundred Content Servers, a simple manual task can become a nightmare ;). Another obvious solution is to not use the K8s Service but to stick with the headless/pod name. With this second solution, you might as well try to use the MigrationUtil utility to change all references to the hostname after the installation is done. That would be something interesting to test!
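
Coming back to the first solution, a minimal sketch of the cleanup could look like the following (untested here and to be adapted: the object ID is the one of the second JMS config object from the dump above, and the JMS Projections still have to be redone afterwards, through DA for example):

[dmadmin@documentum-server-0 ~]$ iapi gr_repo
...
API> destroy,c,080f1234500010a4
...
API> reinit,c
...
API> exit
Bye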

 

This article Documentum – Usage of K8s Services to install Documentum? appeared first on Blog dbi services.

Documentum – FQDN Validation on RCS/CFS


In a previous blog, I talked about the possible usage of K8s Services in place of the default headless/pod name and the issues that it brings. This one can be seen as a continuation since it is also related to the usage of K8s Services to install Documentum, but this time with another issue that is specific to an RCS/CFS. This issue & solution might be interesting for you, even if you aren’t using K8s.

As mentioned in this previous blog, the installation of a Primary CS using K8s Services is possible but it might bring you some trouble with a few repository objects. To go further with the testing, without fixing the issues on the first CS, we tried to install an RCS/CFS (second CS for the High Availability) with the exact same parameters. As a reminder, this is what was used:

  • Primary Content Server:
    • headless/pod: documentum-server-0.documentum-server.dbi-ns01.svc.cluster.local
    • K8s Service: cs01.dbi-ns01.svc.cluster.local
  • Remote Content Server:
    • headless/pod: documentum-server-1.documentum-server.dbi-ns01.svc.cluster.local
    • K8s Service: cs02.dbi-ns01.svc.cluster.local
  • Repository & Service: gr_repo

Therefore, the Repository silent properties file contained the following on this second CS:

[dmadmin@documentum-server-1 ~]$ grep -E "FQDN|HOST" RCS_Docbase_Global.properties
SERVER.FQDN=cs02.dbi-ns01.svc.cluster.local
SERVER.REPOSITORY_HOSTNAME=cs01.dbi-ns01.svc.cluster.local
SERVER.PRIMARY_CONNECTION_BROKER_HOST=cs01.dbi-ns01.svc.cluster.local
SERVER.PROJECTED_CONNECTION_BROKER_HOST=cs02.dbi-ns01.svc.cluster.local
SERVER.PROJECTED_DOCBROKER_HOST_OTHER=cs01.dbi-ns01.svc.cluster.local
[dmadmin@documentum-server-1 ~]$

 

I started the silent installation of the Repository and, after a few seconds, the installer exited. Obviously, that means something went wrong. Checking the installation logs:

[dmadmin@documentum-server-1 ~]$ cd $DM_HOME/install/logs
[dmadmin@documentum-server-1 logs]$ cat install.log
13:42:26,225  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product name is: CfsConfigurator
13:42:26,225  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product version is: 16.4.0000.0248
13:42:26,225  INFO [main]  -
13:42:26,308  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - Done InitializeSharedLibrary ...
13:42:26,332  INFO [main] com.documentum.install.multinode.cfs.installanywhere.actions.DiWAServerCfsInitializeImportantServerVariables - The installer is gathering system configuration information.
13:42:26,349  INFO [main] com.documentum.install.server.installanywhere.actions.DiWASilentRemoteServerValidation - Start to verify the password
13:42:29,357  INFO [main] com.documentum.install.server.installanywhere.actions.DiWASilentRemoteServerValidation - FQDN is invalid
13:42:29,359 ERROR [main] com.documentum.install.server.installanywhere.actions.DiWASilentRemoteServerValidation - Fail to reach the computer with the FQDN "cs02.dbi-ns01.svc.cluster.local". Check the value you specified. Click Yes to ignore this error, or click No to re-enter the FQDN.
com.documentum.install.shared.common.error.DiException: Fail to reach the computer with the FQDN "cs02.dbi-ns01.svc.cluster.local". Check the value you specified. Click Yes to ignore this error, or click No to re-enter the FQDN.
        at com.documentum.install.server.installanywhere.actions.DiWASilentRemoteServerValidation.setup(DiWASilentRemoteServerValidation.java:64)
        at com.documentum.install.shared.installanywhere.actions.InstallWizardAction.install(InstallWizardAction.java:73)
        at com.zerog.ia.installer.actions.CustomAction.installSelf(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.an(Unknown Source)
        at com.zerog.ia.installer.ConsoleBasedAAMgr.ac(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.am(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.runNextInstallPiece(Unknown Source)
        at com.zerog.ia.installer.ConsoleBasedAAMgr.ac(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.am(Unknown Source)
        ...
        at com.zerog.ia.installer.AAMgrBase.runNextInstallPiece(Unknown Source)
        at com.zerog.ia.installer.ConsoleBasedAAMgr.ac(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.am(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.runNextInstallPiece(Unknown Source)
        at com.zerog.ia.installer.ConsoleBasedAAMgr.ac(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.runPreInstall(Unknown Source)
        at com.zerog.ia.installer.LifeCycleManager.consoleInstallMain(Unknown Source)
        at com.zerog.ia.installer.LifeCycleManager.executeApplication(Unknown Source)
        at com.zerog.ia.installer.Main.main(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.zerog.lax.LAX.launch(Unknown Source)
        at com.zerog.lax.LAX.main(Unknown Source)
[dmadmin@documentum-server-1 logs]$

 

On the Primary CS, the installation using the K8s Service went smoothly without error but on the Remote CS, with the exact same setup, it failed with the message: ‘Fail to reach the computer with the FQDN “cs02.dbi-ns01.svc.cluster.local”. Check the value you specified. Click Yes to ignore this error, or click No to re-enter the FQDN.‘. So the installer binaries behave differently depending on whether it’s a PCS or an RCS/CFS. Another funny thing is the message that says ‘Click Yes to ignore this error, or click No to re-enter the FQDN‘… That’s obviously a GUI message being printed to the logs but, fortunately, the silent installer isn’t just waiting for an input that will never come.

I assumed that this had something to do with the K8s Services and some kind of network/hostname validation that the RCS/CFS installer is trying to do (which isn’t done on the Primary). Therefore, I tried a few things like checking nslookup & ping and validating that the docbroker is responding:

[dmadmin@documentum-server-1 logs]$ nslookup cs01.dbi-ns01.svc.cluster.local
Server: 1.1.1.10
Address: 1.1.1.10#53

Name: cs01.dbi-ns01.svc.cluster.local
Address: 1.1.1.100
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$ ping cs01.dbi-ns01.svc.cluster.local
PING cs01.dbi-ns01.svc.cluster.local (1.1.1.100) 56(84) bytes of data.
^C
--- cs01.dbi-ns01.svc.cluster.local ping statistics ---
12 packets transmitted, 0 received, 100% packet loss, time 10999ms
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$ dmqdocbroker -t cs01.dbi-ns01.svc.cluster.local -p 1489 -c ping
dmqdocbroker: A DocBroker Query Tool
dmqdocbroker: Documentum Client Library Version: 16.4.0110.0058
Using specified port: 1489
Successful reply from docbroker at host (documentum-server-0) on port(1490) running software version (16.4.0110.0167  Linux64).
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$ nslookup cs02.dbi-ns01.svc.cluster.local
Server: 1.1.1.10
Address: 1.1.1.10#53

Name: cs02.dbi-ns01.svc.cluster.local
Address: 1.1.1.200
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$ ping cs02.dbi-ns01.svc.cluster.local
PING cs02.dbi-ns01.svc.cluster.local (1.1.1.200) 56(84) bytes of data.
^C
--- cs02.dbi-ns01.svc.cluster.local ping statistics ---
12 packets transmitted, 0 received, 100% packet loss, time 10999ms
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$ dmqdocbroker -t cs02.dbi-ns01.svc.cluster.local -p 1489 -c ping
dmqdocbroker: A DocBroker Query Tool
dmqdocbroker: Documentum Client Library Version: 16.4.0110.0058
Using specified port: 1489
Successful reply from docbroker at host (documentum-server-1) on port(1490) running software version (16.4.0110.0167  Linux64).
[dmadmin@documentum-server-1 logs]$

 

As you can see above, the result is the same for the Primary CS and the Remote one. The only thing not responding is the ping, but that’s because it’s a K8s Service… At this point, I assumed that the RCS/CFS installer was trying to do something like a ping, which fails, hence the error in the log and the installer stopping. To validate that, I simply updated the file /etc/hosts a little bit (as root obviously):

[root@documentum-server-1 ~]$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1       localhost ip6-localhost ip6-loopback
fe00::0   ip6-localnet
fe00::0   ip6-mcastprefix
fe00::1   ip6-allnodes
fe00::2   ip6-allrouters
1.1.1.200  documentum-server-1.documentum-server.dbi-ns01.svc.cluster.local  documentum-server-1
[root@documentum-server-1 ~]$
[root@documentum-server-1 ~]$ echo '1.1.1.200  cs02.dbi-ns01.svc.cluster.local' >> /etc/hosts
[root@documentum-server-1 ~]$
[root@documentum-server-1 ~]$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1       localhost ip6-localhost ip6-loopback
fe00::0   ip6-localnet
fe00::0   ip6-mcastprefix
fe00::1   ip6-allnodes
fe00::2   ip6-allrouters
1.1.1.200  documentum-server-1.documentum-server.dbi-ns01.svc.cluster.local  documentum-server-1
1.1.1.200  cs02.dbi-ns01.svc.cluster.local
[root@documentum-server-1 ~]$

 

After doing that, I started the RCS/CFS installer in silent mode again (exact same command, no changes to the properties file) and this time, it was able to complete the installation without issue.

[dmadmin@documentum-server-1 ~]$ cd $DM_HOME/install/logs
[dmadmin@documentum-server-1 logs]$ cat install.log
14:01:33,199 INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product name is: CfsConfigurator
14:01:33,199 INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product version is: 16.4.0000.0248
14:01:33,199 INFO [main] -
14:01:33,247 INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - Done InitializeSharedLibrary ...
14:01:33,278 INFO [main] com.documentum.install.multinode.cfs.installanywhere.actions.DiWAServerCfsInitializeImportantServerVariables - The installer is gathering system configuration information.
14:01:33,296 INFO [main] com.documentum.install.server.installanywhere.actions.DiWASilentRemoteServerValidation - Start to verify the password
14:01:33,906 INFO [main] com.documentum.fc.client.security.impl.JKSKeystoreUtilForDfc - keystore file name is /tmp/089972.tmp/dfc.keystore
14:01:34,394 INFO [main] com.documentum.fc.client.security.internal.CreateIdentityCredential$MultiFormatPKIKeyPair - generated RSA (2,048-bit strength) mutiformat key pair in 468 ms
14:01:34,428 INFO [main] com.documentum.fc.client.security.internal.CreateIdentityCredential - certificate created for DFC <CN=dfc_MlM5tLi5T9u1r82AdbulKv14vr8a,O=EMC,OU=Documentum> valid from Tue Sep 10 13:56:33 UTC 2019 to Fri Sep 07 14:01:33 UTC 2029:
14:01:34,429 INFO [main] com.documentum.fc.client.security.impl.JKSKeystoreUtilForDfc - keystore file name is /tmp/089972.tmp/dfc.keystore
14:01:34,446 INFO [main] com.documentum.fc.client.security.impl.InitializeKeystoreForDfc - [DFC_SECURITY_IDENTITY_INITIALIZED] Initialized new identity in keystore, DFC alias=dfc, identity=dfc_MlM5tLi5T9u1r82AdbulKv14vr8a
14:01:34,448 INFO [main] com.documentum.fc.client.security.impl.AuthenticationMgrForDfc - identity for authentication is dfc_MlM5tLi5T9u1r82AdbulKv14vr8a
14:01:34,449 INFO [main] com.documentum.fc.impl.RuntimeContext - DFC Version is 16.4.0110.0058
14:01:34,472 INFO [Timer-3] com.documentum.fc.client.impl.bof.cache.ClassCacheManager$CacheCleanupTask - [DFC_BOF_RUNNING_CLEANUP] Running class cache cleanup task
...
[dmadmin@documentum-server-1 logs]$

 

Since this obviously looks like a bug, I opened an SR with OpenText Support (#4252205). The outcome of this ticket is that the RCS/CFS installer is indeed doing a different validation than what is done by the PCS installer, and that’s why the issue only affects the RCS/CFS. At the moment, there is no way to skip this validation when using the silent installer (contrary to the GUI, which allows you to ‘click Yes‘). Therefore, OpenText decided to add a new parameter starting with CS 16.4 P20 (end of December 2019) to control whether the FQDN validation should be done or just skipped. This new parameter will be “SERVER.VALIDATE_FQDN” and it will be a Boolean. The default value will be “true”, so by default the FQDN validation will still be done. To skip it starting with P20, just set the value to false and the RCS/CFS installer should be able to complete successfully. To be tested once the patch is out!
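
Once on P20 or later, the silent properties file for the RCS/CFS should then only need something along these lines (an assumption at this point, to be confirmed once the patch is released):

SERVER.FQDN=cs02.dbi-ns01.svc.cluster.local
# New in CS 16.4 P20 (assumed): skip the FQDN validation done by the RCS/CFS installer
SERVER.VALIDATE_FQDN=false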

 

This article Documentum – FQDN Validation on RCS/CFS appeared first on Blog dbi services.

Documentum – FT – FTIntegrity Tool


FTIntegrity is used to verify indexing after a migration or to get a status after normal indexing. The tool is a standalone Java program that verifies all types that are registered in the dmi_registry table for the user dm_fulltext_index_user. By default, the utility compares the object ID and i_vstamp between the repository and xPlore.
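
As a side note, to see what is actually registered for that user, a query like the following should do (a sketch, assuming the standard dmi_registry attributes user_name, registered_id and event):

?,c,select registered_id, event from dmi_registry where user_name = 'dm_fulltext_index_user'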

In this blog I will share with you the default FTIntegrity capabilities, and also some optional configurations.

Default Configuration

Preparation

A script is provided to execute the FTIntegrity Tool, but a few preparation steps are needed before the first use (a consolidated sketch of the whole sequence follows the list):

  • Verify that the IndexAgent is started
  • Define the docbase name:
    export DCTM_DOCBASE_NAME="MYREPO"
  • Navigate to XPLORE_HOME/setup/indexagent/tools
  • Substitute the repository instance owner password in the script ftintegrity_for_${DCTM_DOCBASE_NAME}.sh. Of course, for security reasons, it is highly recommended not to put the password into the script; the workaround is to read the password into a variable:
    stty -echo;read MDP;stty echo
  • Update the script to take the password as a parameter:
    sed -i 's/password_change_me/$1/g' ftintegrity_for_${DCTM_DOCBASE_NAME}.sh
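
Putting these steps together, a minimal sketch could look like this (assuming $XPLORE_HOME is set and that password_change_me is the placeholder present in the generated script):

export DCTM_DOCBASE_NAME="MYREPO"
cd $XPLORE_HOME/setup/indexagent/tools
# read the password without echoing it instead of hardcoding it in the script
stty -echo; read MDP; stty echo
# make the script take the password as its first parameter
sed -i 's/password_change_me/$1/g' ftintegrity_for_${DCTM_DOCBASE_NAME}.sh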

Execution

Now you can start the FTIntegrity tool. It usually takes a long time, so it is recommended to execute it with nohup so that it keeps running even if the session is disconnected:

nohup ./ftintegrity_for_${DCTM_DOCBASE_NAME}.sh $MDP > ftintegrity_for_${DCTM_DOCBASE_NAME}_$(date +%Y%m%d).log &

Result

Wait for the end of the execution, then check the files generated under XPLORE_HOME/setup/indexagent/tools:

-rw-r-----. 1 xplore xplore    15640 Dec  3 09:22 ObjectId-indexOnly.txt
-rw-r-----. 1 xplore xplore     7734 Dec  3 09:25 ObjectId-common-version-mismatch.txt
-rw-r-----. 1 xplore xplore  4237982 Dec  3 09:25 ObjectId-common-version-match.txt
-rw-r-----. 1 xplore xplore 20715741 Dec  3 09:25 ObjectId-dctmOnly.txt
-rw-r-----. 1 xplore xplore     1267 Dec  3 09:25 ftintegrity_for_MYREPO_20191203.log

The script generates four reports:

  • ObjectId-common-version-match.txt
    This file contains the object IDs and i_vstamp values of all objects present in both the index and the repository with identical i_vstamp values in both places.
  • ObjectId-common-version-mismatch.txt
    This file records all objects in the index and the repository with identical object IDs but nonmatching i_vstamp values. For each object, it records the objectID, i_vstamp value in the repository, and i_vstamp value in the index. The mismatch is on objects that were modified during or after migration.
  • ObjectId-dctmOnly.txt
    This report contains the object IDs and i_vstamp values of objects in the repository but not in the index. The objects in this report could be documents that failed indexing, documents that were filtered out, or new objects generated after migration.
  • ObjectId-indexOnly.txt
    This report contains the object IDs and i_vstamp values of objects in the index but not in the repository.
    These objects were removed from the repository during or after migration, before the event has updated the index.
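
To get a quick idea of the volume in each category, a simple line count of the reports (from the tools directory) is enough:

wc -l ObjectId-*.txt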
You can resubmit the lists (version-mismatch, dctmOnly, and indexOnly) to align the index with the docbase. To do so, start the IndexAgent in normal mode, then click Object File and browse to the file.

Optional Configuration

Use filter

As you can see below, the ObjectId-dctmOnly.txt report is huge, so it contains a lot of documents:

-rw-r-----. 1 xplore xplore 20715741 Dec  3 09:25 ObjectId-dctmOnly.txt

This is because some filters are configured in the docbase and prevent some documents from being indexed. To check these filters, execute the API query below:

?,c,select r_object_id,object_name,primary_class from dmc_module where any a_interfaces='com.documentum.fc.indexagent.IDfCustomIndexFilter'

In my case I get the below result:

r_object_id				object_name																			primary_class
----------------		----------------------------------------------------------------------------		----------------------------------------------------------------------------------
0b01e24080000883		com.documentum.services.message.impl.type.MailMessageChildFilter					com.documentum.services.message.impl.type.MailMessageChildFilter
0b01e24080000884		com.documentum.services.message.impl.type.MailMessageChildFilter..J5_D65			com.documentum.services.message.impl.type.MailMessageChildFilter
0b01e24080000bef		com.documentum.server.impl.fulltext.indexagent.filter.defaultCabinetFilterAction	com.documentum.server.impl.fulltext.indexagent.filter.defaultCabinetFilterAction
0b01e24080000bf0		com.documentum.server.impl.fulltext.indexagent.filter.defaultFolderFilterAction		com.documentum.server.impl.fulltext.indexagent.filter.defaultFolderFilterAction
0b01e24080000bf3		com.documentum.server.impl.fulltext.indexagent.filter.defaultTypeFilterAction		com.documentum.server.impl.fulltext.indexagent.filter.defaultTypeFilterAction
(5 rows affected)

By default, FTIntegrity doesn’t take the filters into account. To include the filters, update the script and add “-useFilter T” with the command below:

sed -i '/BATCH_SIZE/s/$/ -useFilter T/' ftintegrity_for_${DCTM_DOCBASE_NAME}.sh

Check specific type

Use the -checkType option to check a specific object type; use the command below to update the script:

sed -i '/BATCH_SIZE/s/$/ -checkType dm_document/' ftintegrity_for_${DCTM_DOCBASE_NAME}.sh

In this way, only dm_document objects are verified.

Compare specific metadata

You can also compare metadata values, which compares object IDs and the specified attributes; to do so, use the -checkMetadata option. Be careful: when this option is used, you also have to add the -checkType option. For example, to compare only the a_is_hidden attribute:
Define the list of attributes in a file (one attribute per line):

cat $XPLORE_HOME/setup/indexagent/tools/doclist.txt
a_is_hidden

Then specify the list of metadata to be checked by the script:

...
... $DSEARCH_DOMAIN $OUTPUT_FILEPATH $BATCH_SIZE -useFilter T -checkType dm_document -CheckMetadata XPLORE_HOME/setup/indexagent/tools/doclist.txt
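
If you prefer to script this change like the previous ones, a sed along these lines should do it (a sketch only; note that the command line above shows -CheckMetadata while the option is otherwise written -checkMetadata, so use the casing your ftintegrity script expects):

sed -i '/BATCH_SIZE/s|$| -checkMetadata '$XPLORE_HOME'/setup/indexagent/tools/doclist.txt|' ftintegrity_for_${DCTM_DOCBASE_NAME}.sh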

As a result, you will have two new reports:

  • Object-Metadata-mismatch.txt
    Contains all the objects whose metadata has inconsistencies.
  • Object-Metadata-match.txt
    Contains all the objects whose metadata is consistent.

In this way, you can compare really everything! Don’t hesitate to ask or share your experience with this tool 😉

This article Documentum – FT – FTIntegrity Tool appeared first on Blog dbi services.
