Keeping an eye on DFS-R

I’ve never really used DFS-R with many file servers, but here I am. The company I work with has been having replication issues ever since a member was taken offline and brought back.

For maintenance, they had disabled the member as a namespace server for that namespace and disabled its memberships. When the server was up again, they re-enabled its DFS and replication memberships accordingly.

To me, that seems like a good idea, but somehow this member is now having issues. Nothing gets in or out as far as replication goes.

First off, there are a few essential reads for DFS-R. I really recommend spending some time reading them as a refresher or even as guidance.

http://blogs.technet.com/b/askds/archive/2007/10/05/top-10-common-causes-of-slow-replication-with-dfsr.aspx

http://blogs.technet.com/b/askds/archive/2010/11/01/common-dfsr-configuration-mistakes-and-oversights.aspx

I also highly recommend this article should you want a reminder of what it means to be a member of a Replication Group.

http://www.adshotgyan.com/2010/12/dfsr-replication-group-in-windows-2008.html

 

Here we go. I just want to share a few things I do and look for when I start troubleshooting.

Check the Health of a Replication Group (RG)

> dfsradmin health new /rgname:RG_NAME /refmemname:FROM_THE_VIEW_OF_WHICH_MEMBER /domain:YOUR_DOMAIN /ReportName:c:\scripts\dfsmonitor\health\RGNAME_health_rpt.html

This report will tell you what replication is doing and whether it encountered any errors, without you having to parse the event logs. Should you want to automate this, have a look at this basic script.
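Independently of that script, here is a rough sketch of what such automation could look like: it builds the same dfsradmin command once per replication group. The group names, reference member, domain and report folder are placeholders, and the flags are simply copied from the command above, so adjust them to whatever your version of dfsradmin accepts. By default it only prints the commands (the $DryRun switch is my own addition).

```powershell
# Hypothetical values: replace with your own RGs, member, domain and folder.
$ReplicationGroups = @('RG_ONE', 'RG_TWO')
$ReferenceMember   = 'MEMBER_A'
$Domain            = 'YOUR_DOMAIN'
$ReportFolder      = 'c:\scripts\dfsmonitor\health'
$DryRun            = $true   # set to $false to actually call dfsradmin

# Build the argument list for one replication group
# (same flags as the dfsradmin command shown above).
function Get-HealthReportArguments {
    param([string]$RgName, [string]$RefMember, [string]$DomainName, [string]$Folder)
    $report = Join-Path $Folder ("{0}_health_rpt.html" -f $RgName)
    return @('health', 'new',
             "/rgname:$RgName",
             "/refmemname:$RefMember",
             "/domain:$DomainName",
             "/ReportName:$report")
}

foreach ($rg in $ReplicationGroups) {
    $arguments = Get-HealthReportArguments $rg $ReferenceMember $Domain $ReportFolder
    if ($DryRun) {
        # Only show what would be executed.
        Write-Host "dfsradmin.exe $($arguments -join ' ')"
    } else {
        & dfsradmin.exe @arguments
    }
}
```

Run it from a scheduled task with $DryRun set to $false and you get one HTML report per RG in the report folder.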

Do a file replication/propagation test

dfsrdiag offers many options; one of them tests the propagation of a file to the various members of the RG. You would do the following:

> dfsrdiag.exe propagationtest /rgname:RG_NAME /rfname:REPLICATION_FOLDER_TO_TEST /testfile:A_FUNNY_NAME

Operation Succeeded

You then will want to wait a little bit for things to happen and after a few minutes verify what has happened with the propagationreport option.

> dfsrdiag.exe propagationreport /rgname:RG_NAME /rfname:REPLICATION_FOLDER_TO_TEST /testfile:A_FUNNY_NAME /reportfile:c:\scripts\dfsmonitor\propagationtest.xml

PROCESSING MEMBER A[1 OUT OF 3]
PROCESSING MEMBER B[2 OUT OF 3]
PROCESSING MEMBER C[3 OUT OF 3]

Total number of members: 3

Number of disabled members: 0

Number of unsubscribed members: 0

Number of invalid AD member objects: 0

Test file access failures: 0

WMI access failures: 0

ID record search failures: 0

Test file mismatches: 0

Members with valid test file: 3

Operation Succeeded
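If you run this test regularly, you may not want to read the summary by eye every time. A minimal sketch, assuming the console output keeps the "label: number" shape shown above: it turns those lines into a hashtable and compares the number of members holding a valid test file with the total. The function name and the sample array are mine; in practice you would capture the real output of dfsrdiag.exe into $output.

```powershell
# Turn the "Label: number" lines of the propagationreport output into a hashtable.
# Lines without that shape (e.g. "PROCESSING MEMBER ...") are ignored.
function ConvertFrom-PropagationSummary {
    param([string[]]$Lines)
    $summary = @{}
    foreach ($line in $Lines) {
        if ($line -match '^\s*(?<label>[^:]+):\s*(?<count>\d+)\s*$') {
            $summary[$Matches['label'].Trim()] = [int]$Matches['count']
        }
    }
    return $summary
}

# Sample lines copied from the output above.
$output = @(
    'PROCESSING MEMBER C[3 OUT OF 3]',
    'Total number of members: 3',
    'Number of disabled members: 0',
    'Test file mismatches: 0',
    'Members with valid test file: 3',
    'Operation Succeeded'
)

$summary = ConvertFrom-PropagationSummary -Lines $output
if ($summary['Members with valid test file'] -eq $summary['Total number of members']) {
    Write-Host 'Test file reached every member.'
} else {
    Write-Warning 'At least one member did not receive the test file.'
}
```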

Verify backlogs

> dfsrdiag.exe backlog /rgname:RG_NAME /rfname:REPLICATION_FOLDER_TO_TEST /smem:SOURCE_MEMBER /rmem:TARGET_MEMBER

This will give the number of files in the queue and will list them if you wish. That said, it will only output 100 of them by default.
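Since the backlog is counted per direction, checking a whole RG means running the command for each source/target pair. A rough sketch that generates those pairs and the matching commands follows; the member names are placeholders, the $DryRun switch and the Get-MemberPairs helper are my own additions, and it assumes every pair is worth checking, so trim the list to match your topology.

```powershell
# Hypothetical names: replace with your RG, replicated folder and members.
$RgName  = 'RG_NAME'
$RfName  = 'REPLICATION_FOLDER_TO_TEST'
$Members = @('MEMBER_A', 'MEMBER_B', 'MEMBER_C')
$DryRun  = $true   # set to $false to actually call dfsrdiag

# Emit every ordered (source, target) pair of distinct members.
function Get-MemberPairs {
    param([string[]]$Names)
    foreach ($source in $Names) {
        foreach ($target in $Names) {
            if ($source -ne $target) {
                [pscustomobject]@{ Source = $source; Target = $target }
            }
        }
    }
}

foreach ($pair in (Get-MemberPairs -Names $Members)) {
    # Same flags as the backlog command shown above.
    $arguments = @('backlog', "/rgname:$RgName", "/rfname:$RfName",
                   "/smem:$($pair.Source)", "/rmem:$($pair.Target)")
    if ($DryRun) {
        Write-Host "dfsrdiag.exe $($arguments -join ' ')"
    } else {
        Write-Host "--- $($pair.Source) -> $($pair.Target)"
        # Keep only the first few lines; the file list itself can be long.
        & dfsrdiag.exe @arguments | Select-Object -First 5
    }
}
```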

You can also use this script in order to keep track of the backlogs.

Restart the DFS-R service

If things still seem stuck after you have checked everything above and the DFS-R log (usually located in c:\windows\debug\log), you might want to restart the service once.

> Get-Service *dfsr* | Restart-Service

WARNING: Waiting for service 'DFS Replication (DFSR)' to finish stopping...

> Get-Service *dfsr*

Status   Name               DisplayName
------   ----               -----------
Running  DFSR               DFS Replication

In the end

If you have read all this and are still not sure how to fix your replication issues, contact Microsoft. Any mistake while playing with members and their RGs may render files inaccessible for your users. That said, I have yet to find a good way to monitor all of my RGs besides loading those health reports. I wish there was a better way to check, in a single MMC, what is going on: which files it is working on, the queues, the state of the members and so on.

I also want to list here a few other articles that may help with troubleshooting and recreating your RGs.

Clearing the ConflictAndDeleted folder

http://blogs.technet.com/b/askds/archive/2008/10/06/manually-clearing-the-conflictanddeleted-folder-in-dfsr.aspx

Preseeding another member

http://blogs.technet.com/b/askds/archive/2010/09/07/replacing-dfsr-member-hardware-or-os-part-2-pre-seeding.aspx

 

 

Changing Timezone on Debian

I am not used to working on this distro, so here is how to change the time zone settings after installation. It is not tzconfig anymore; it is dpkg-reconfigure tzdata.

root ~# date
Mon Jul 21 20:08:55 UTC 2014
root ~# cat /etc/timezone
Etc/UTC
root ~# tzconfig
WARNING: the tzconfig command is deprecated, please use:
dpkg-reconfigure tzdata
root ~# dpkg-reconfigure tzdata

Current default time zone: 'America/Toronto'
Local time is now:      Mon Jul 21 16:10:11 EDT 2014.
Universal Time is now:  Mon Jul 21 20:10:11 UTC 2014.

root ~# date
Mon Jul 21 16:10:18 EDT 2014

 

Working with EMC NAR Files

I had the (un)pleasure of working with EMC arrays again. While I thought I had been spared, their market share caught up with me and now I have to deal with them.

The context is the following: I have a few VNX arrays and I need to understand how they perform before attempting some SAN Copy LUN migrations.

The logic is the following:
1. Turn logging on (obviously)
2. Generate NAR files
3. Work with those NAR files (you will need Unisphere Analyzer)

Turning Logging on
In order to actually get anything, you need to enable data logging.

1. In the array's Storage System Properties, on the General tab, in the Configuration section, check the “Statistics logging” box.

2. Under System > Monitoring and Alerts, launch the Statistics screen. In the Settings section, configure Performance Data Logging with the following:

Real Time interval: I like to use 300 for preliminary findings; I would use less if I needed to correlate with more specific events.

Archive interval: I am not sure why, but I use double the Real Time value so that one NAR file covers 2 intervals. And yes, the archive interval determines the number of NAR files you end up with.

Check Periodic Archiving to create the NAR files in line with the archive interval above. If you don’t, you will have to watch the performance screens live, sitting in front of them, as they update as often as the Real Time interval you set.

For the sake of not forgetting, always set an end date for the capture.

 

Generate the NAR files

If you checked Periodic Archiving, your NAR files will be available. Use the Archive Management section under System > Monitoring and Alerts > Statistics to retrieve your archives.

Download them to your machine in order to play with them. Note that you can also delete them if you see fit. And yes, those files will stay there after the data logging is over.

Use them

Once you have downloaded the files for the time frame you need, or all of them in order to get the full picture, you may need to merge them. Otherwise, you will have to use Archive Management to load one file at a time and review the statistics.

You can use the PowerShell script below to have naviseccli do the merging of many NAR files. If you only have 2 files to merge, Unisphere’s merge archive feature will work for you.

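# Merge every file found in C:\Temp\nar_in into a single archive, one file at a time.
# Assumes NaviSECCli.exe is in the current directory and C:\Temp\nar_out already exists.
$TempFile   = "C:\Temp\nar_out\Temp.nar"
$OutputFile = "C:\Temp\nar_out\NAR_Merge.nar"
$NarFiles   = Get-ChildItem "C:\Temp\nar_in"
$FileCount  = 0
foreach ($NFile in $NarFiles) {
    if ($FileCount -eq 0) {
        # The first file seeds the merged archive.
        Copy-Item $NFile.FullName $OutputFile
    } else {
        # Move the current result aside, then merge the next file into it.
        Copy-Item $OutputFile $TempFile
        Remove-Item $OutputFile
        ./NaviSECCli.exe analyzer -archivemerge -data $TempFile $NFile.FullName -out $OutputFile
    }
    $FileCount++
}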

Then load the result in Unisphere Analyzer and off you go for hours of fun tracking which SP or LUN is performing worst.

I have also used the DIY heatmap script here. I also remember something called Analyzer Help that kind of does the same thing, but I cannot figure out where to download it from.

How do you process your information (NAR files) for it to make sense?