Channel: SQL Archives - SQL Authority with Pinal Dave

SQL SERVER – FIX: Error – The Job Failed. Unable to Determine If The Owner Domain\User of Job Job_Name Has Server Access


I keep breaking my lab environment, and I have learned many things from it. Here is one of the recent errors I fixed. In this blog, we will learn how to fix the error “The job failed. Unable to determine if the owner <Domain\User> of job <Job_Name> has server access.” while executing a SQL Agent job.

Here is the complete error message which I saw in the logs.

The job failed.  Unable to determine if the owner (SQLAuth\SQLSvc) of job MntPlan.RebuildIndex_UpdStats has server access (reason: Could not obtain information about Windows NT group/user ‘SQLAuth\SQLSvc’, error code 0x5. SQLSTATE 42000 (Error 15404))

As the error message states, SQL Server could not confirm that the job owner has access to the instance. In my case, I had deleted the user from AD to reproduce the error. This means the user had access when the maintenance plan was created, but not anymore.

WORKAROUND/SOLUTION

Since this was a maintenance plan, we could simply update the job owner, but I have seen an issue with that approach: the setting is overwritten when the maintenance plan is edited and saved. So, the right way is to modify the owner of the maintenance plan itself by using T-SQL. (Here we are making it “sa”.)
USE MSDB
GO
UPDATE sysssispackages
SET ownersid = SUSER_SID('sa')
WHERE NAME = 'Name Of Maint Plan'

After making “sa” the owner and saving the maintenance plan again, the job executed successfully. As a matter of fact, I have seen this quite a lot in the real world: when I execute something as the SA login, it usually takes care of the security issues.
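Before changing anything, it can help to see who currently owns the jobs and maintenance plans. The following is only a minimal sketch, assuming the standard msdb tables sysjobs and sysssispackages and the job name taken from the error message above. Changing the job owner directly is subject to the overwrite caveat already mentioned.

```sql
USE msdb;
GO
-- List SQL Agent jobs with their owners. owner_name may come back NULL
-- when the SID can no longer be resolved to a login.
SELECT j.name AS job_name,
       SUSER_SNAME(j.owner_sid) AS owner_name
FROM dbo.sysjobs AS j
ORDER BY j.name;

-- List maintenance plan (SSIS) packages with their owners.
SELECT p.name AS package_name,
       SUSER_SNAME(p.ownersid) AS owner_name
FROM dbo.sysssispackages AS p
ORDER BY p.name;

-- Change the owner of one job. This may be overwritten the next time
-- the maintenance plan is edited and saved.
EXEC dbo.sp_update_job
     @job_name = N'MntPlan.RebuildIndex_UpdStats',
     @owner_login_name = N'sa';
```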

Do you know any other solution to fix the error? Please share via the comment section.

Reference: Pinal Dave (https://blog.SQLAuthority.com)



SQL SERVER – FIX: Backup to URL Error: Operating System Error 50(The Request is Not Supported.)


It is always fun to work with the “Backup to URL” feature of SQL Server. The error messages it raises come from the Azure storage side, and a SQL DBA may not be able to understand their meaning. While I was working with my VM to learn something about the Backup to URL feature, I realized that my backups were failing. In this blog, let us learn how to fix the Backup to URL error: Operating system error 50(The request is not supported.). Here are the exact messages which I was getting in the ERRORLOG.

2018-08-17 00:58:22.85 spid125 Error: 18204, Severity: 16, State: 1.
2018-08-17 00:58:22.85 spid125 BackupDiskFile::CreateMedia: Backup device ‘https://sqlauthbackup.blob.core.windows.net/backupcontainer/sqlauthdb_ebd0fe66f91f43f199c3b52d803bb136_20180814005822-07.log’ failed to create. Operating system error 50(The request is not supported.).
2018-08-17 00:58:22.85 Backup Error: 3041, Severity: 16, State: 1.
2018-08-17 00:58:22.85 Backup BACKUP failed to complete the command BACKUP LOG sqlauthdb. Check the backup application log for detailed messages.

I have already blogged about the same error earlier where the cause was different.

SQL SERVER – Backup to URL error: Operating system error 50(The request is not supported.)

In the current situation, this was a managed backup which was configured using the Azure portal. I had recently generated a new SAS token and updated it in the credential. Since then, the backup had been failing with this error.

WORKAROUND/SOLUTION

I had updated the credential with the new SAS token by copying and pasting the value in SSMS, using the UI shown below.

[Screenshot: updating the credential in SSMS]

It didn’t take much time to realize that I had missed removing the “?” symbol from the SAS token. The SAS token on the portal starts with “?sv”, and while creating a credential, we need to remove the “?” so that the value starts with “sv”.
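As a rough sketch of the same fix in T-SQL, the snippet below strips a leading “?” and updates the credential. The token value is a placeholder, and the credential name is assumed to be the container URL seen in the ERRORLOG above; adjust both for your environment.

```sql
-- Assumption: placeholder token; paste your own SAS token here.
DECLARE @sas NVARCHAR(MAX) = N'?sv=<rest-of-your-sas-token>';

-- Drop the leading "?" that the Azure portal puts in front of the token.
IF LEFT(@sas, 1) = N'?'
    SET @sas = STUFF(@sas, 1, 1, N'');

-- ALTER CREDENTIAL needs a literal secret, so build the statement dynamically.
-- Assumption: the credential is named after the container URL.
DECLARE @sql NVARCHAR(MAX) =
    N'ALTER CREDENTIAL [https://sqlauthbackup.blob.core.windows.net/backupcontainer] '
  + N'WITH IDENTITY = ''SHARED ACCESS SIGNATURE'', '
  + N'SECRET = ''' + REPLACE(@sas, N'''', N'''''') + N''';';

EXEC sys.sp_executesql @sql;
```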

I have done the same in the script which is available on my earlier blog.

SQL SERVER – Backup to URL – Script to Generate Credential and Backup using Shared Access Signature (SAS)

Have you encountered a similar error and found some other cause? Please share via the comment section.

Reference: Pinal Dave (https://blog.SQLAuthority.com)


SQL SERVER – Upgrade Blocked: The Specified Edition Upgrade is Not Supported


As SQL Server 2008 R2 is out of support, my client was trying to upgrade to SQL Server 2012. In this blog, we will learn more about the error message “Upgrade Blocked: The specified edition upgrade is not supported.”

Hi Pinal,

We are in a situation where we need your expert advice.  We are trying to upgrade from SQL Server 2008 R2 Express to SQL Server 2012 Evaluation edition, and we are getting an error which we did not expect.

“SQL Server 2012 Feature Upgrade Failed”

“The specified edition upgrade is not supported. For information about supported upgrade paths, see the SQL Server 2012 version and edition upgrade in Books Online.”

It’s not expected because, if you check the link, it says “You can upgrade from SQL Server 2005, SQL Server 2008, and SQL Server 2008 R2 to SQL Server 2012.”

Can you help us and tell us what is wrong?

WORKAROUND/SOLUTION

I explained to my client that they had not read the complete article. An upgrade from any edition to the Evaluation edition is not possible; there is no direct upgrade path to “Evaluation” in any version. If you want to test the upgrade process, use SQL Server 2012 Express Edition. If they need to check the feasibility of the Enterprise edition, they should install the Evaluation edition separately and restore their databases to it.
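The backup/restore route can be sketched roughly as follows. The database name and backup path are hypothetical, and the restore may additionally need WITH MOVE if the file locations differ on the Evaluation instance.

```sql
-- Confirm the edition and version of the instance you are connected to.
SELECT SERVERPROPERTY('Edition')        AS edition,
       SERVERPROPERTY('ProductVersion') AS product_version;

-- On the source (Express) instance: take a full backup.
-- Assumption: [YourDatabase] and the path are placeholders.
BACKUP DATABASE [YourDatabase]
TO DISK = N'C:\Backup\YourDatabase.bak'
WITH INIT, CHECKSUM;

-- On the separately installed Evaluation instance: restore the backup.
RESTORE DATABASE [YourDatabase]
FROM DISK = N'C:\Backup\YourDatabase.bak'
WITH RECOVERY;
```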

Just for fun, I tried an edition upgrade from SQL Server 2017 Express to SQL Server 2017 Evaluation edition, and the error message is a little more meaningful.

Rule “SQL Server 2017 edition upgrade” failed.
The selected SQL Server instance does not meet upgrade matrix requirements. The source Express edition upgrading from and the target Evaluation edition upgrading to is not allowed.

[Screenshot: failed edition upgrade rule in SQL Server 2017 setup]

Have you encountered the same error earlier? Which option did you select: backup/restore, or upgrading the source to a higher edition first?

Reference: Pinal Dave (https://blog.SQLAuthority.com)


SQL SERVER – FIX: CREATE FILE Encountered Operating System Error 5 (Access is Denied.)


One of my existing clients migrated disks in a SQL Server failover cluster. After completing their planned activity, they found that SQL Server was not coming online in the cluster. In this blog, we will learn how to fix the error CREATE FILE encountered operating system error 5(Access is denied.) on TempDB during the SQL Server startup process.


When they contacted me, I captured a few more details which helped in finding the right corrective action. They had a 2-node cluster. They added a new disk, swapped the drive letters, and copied all the SQL Server files to it. After that, they tried bringing the SQL Server resource online. I captured a snippet of the ERRORLOG file.

2018-08-25 22:49:04.31 spid6s CREATE FILE encountered operating system error 5(Access is denied.) while attempting to open or create the physical file ‘T:\TEMPLOGS\templog.ldf’.
2018-08-25 22:49:04.32 spid6s Error: 5123, Severity: 16, State: 1.
2018-08-25 22:49:04.32 spid6s CREATE FILE encountered operating system error 5(Access is denied.) while attempting to open or create the physical file ‘T:\TEMPDATA\tempdb.mdf’.
2018-08-25 22:49:04.33 spid6s Error: 17204, Severity: 16, State: 1.
2018-08-25 22:49:04.33 spid6s FCB::Open failed: Could not open file T:\TEMPDATA\tempdb.mdf for file number 1. OS error: 2(The system cannot find the file specified.).
2018-08-25 22:49:04.33 spid6s Error: 5120, Severity: 16, State: 101.
2018-08-25 22:49:04.33 spid6s Unable to open the physical file “T:\TEMPDATA\tempdb.mdf”. Operating system error 2: “2(The system cannot find the file specified.)”.
2018-08-25 22:49:04.33 spid6s Error: 1802, Severity: 16, State: 4.
2018-08-25 22:49:04.33 spid6s CREATE DATABASE failed. Some file names listed could not be created. Check related errors.
2018-08-25 22:49:04.33 spid6s Could not create tempdb. You may not have enough disk space available. Free additional disk space by deleting other files on the tempdb drive and then restart SQL Server. Check for additional errors in the event log that may indicate why the tempdb files could not be initialized.

From the above, we can clearly see that SQL Server is not able to create the MDF and LDF files for the TempDB database. Yes! It’s a Windows permissions issue. The SQL Server service account is used to create the new files for TempDB, and when they swapped the drive letters, they did not carry over the permissions on the folders. Since the files are not created, we subsequently see “OS error: 2(The system cannot find the file specified.).”

WORKAROUND/SOLUTION

We need to note the folder in which SQL Server is trying to create the files. You need to check the ERRORLOG in your environment; see my earlier blog: SQL SERVER – Where is ERRORLOG? Various Ways to Find ERRORLOG Location

As per messages in sample ERRORLOG above, it is T:\TEMPLOGS\ and T:\TEMPDATA\.

  • Right-click on the folder and go to Properties > Security tab > Edit button > Add button.
  • Add the domain user, local user, or group under which the SQL Server services are running. Make sure you grant Full Control to it.
  • Click OK on each dialog to apply the changes.

If you are not aware of the steps to find the service account for SQL Server, then the blog below should help: How to Find Service Account for SQL Server and SQL Server Agent? – Interview Question of the Week #179
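On an instance that is up and running (for example, before a planned disk migration), the service account and the TempDB folders can also be read with a quick query. This is just a sketch; it cannot be run while the instance is failing to start, in which case the ERRORLOG remains the source of the paths.

```sql
-- Service accounts under which the SQL Server services are running.
SELECT servicename,
       service_account,
       status_desc
FROM sys.dm_server_services;

-- Folders where the TempDB files are expected; the service account
-- needs permission on these folders.
SELECT name,
       type_desc,
       physical_name
FROM sys.master_files
WHERE database_id = DB_ID(N'tempdb');
```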

Hopefully, this blog will help someone who is not a SQL DBA to fix this issue.

Reference: Pinal Dave (https://blog.SQLAuthority.com)


SQL SERVER – How to DROP or DELETE Suspect Database?


One of my clients contacted me when they had a business-down situation. One of their production databases was in a suspect state, and they were unable to drop it. In this blog, we will learn how to drop a suspect database.

When they contacted me, I started GoToMeeting and got connected in a few minutes. I asked them about the history, but they were not much aware of it. This was one of the most critical databases for their business. They had regular backups, and they wanted to restore from backup if things did not get better in the next few minutes.

I tried restarting the SQL Server service, and the database showed the “In Recovery” state. When I checked the SQL ERRORLOG, it showed that the database had started the recovery process. Recovery went up to 10% and then failed with a dump. Here is my earlier blog about various causes of dumps.

Here is the message we got for any operation we tried on the database.

Msg 922, Level 14, State 1, Line 1
Database ‘Database_Name’ is being recovered. Waiting until recovery is finished.

When I checked ERRORLOG, I found below:

<DateTime> spid23s     Error: 824, Severity: 24, State: 2.
<DateTime> spid23s     SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1:67; actual 0:3342337). It occurred during a read of page (1:66) in database ID 15 at offset 0x00000000088000 in file’F:\SQLMDF\PROD_DB_Data.mdf’.  Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
<DateTime> spid23s     Error: 3414, Severity: 21, State: 1.
<DateTime> spid23s     An error occurred during recovery, preventing the database ‘PROD_DB’ (database ID 15) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.

SOLUTION/WORKAROUND

As we can see above, once we start the SQL Server service, the database goes through recovery, and it fails there. Due to this, we were not able to drop the database. I had not seen this earlier, as I had always been able to drop such databases before a restore. Here are the steps we followed to drop the database and perform the restore:

  1. Stop the SQL Server service.
  2. Take a safe copy of existing MDF/NDF and LDF files.
  3. Rename the file (MDF or LDF or both) related to this database.
  4. Start the SQL Server service.

Since the files are not available, the database startup fails, and the database goes to the “Recovery Pending” state. After this, we were able to drop the database. As I mentioned, they were ready to restore it from the backup, and it worked well.
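The T-SQL side of the drop and restore can be sketched as below. The database name is the one from the ERRORLOG above; the backup path is hypothetical. The renamed files are expected to stay on disk until you remove them manually.

```sql
-- Confirm the state after the restart; the expectation is RECOVERY_PENDING.
SELECT name, state_desc
FROM sys.databases
WHERE name = N'PROD_DB';

-- Drop the database now that it is no longer stuck in recovery.
DROP DATABASE [PROD_DB];

-- Restore from the last known good backup.
-- Assumption: the backup path is a placeholder.
RESTORE DATABASE [PROD_DB]
FROM DISK = N'X:\Backup\PROD_DB_full.bak'
WITH RECOVERY;
```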

Have you ever come across such a situation? What did you do to fix it?

Reference: Pinal Dave (https://blog.SQLAuthority.com)


SQL SERVER – Unable to Attach Database Files – The PageAudit Property is Incorrect – Ransomware Attack


Recently, one of my clients contacted me after reading my blog about ransomware on a SQL Server machine. In this blog, we will learn about the error “The PageAudit property is incorrect.”

Here is my earlier blog on the ransomware topic. SQL SERVER – How to Protect Your Database from Ransomware?

Here were the errors in SQL Server ERRORLOG when the database was trying to start up.

Msg 5172, Level 16, State 15, Line 1
The header for file ‘E:\SQLMDF\FINANCE_DB\FINANCE_DB.mdf’ is not a valid database file header. The PageAudit property is incorrect.
Msg 945, Level 14, State 2, Line 1
Database ‘FINANCE_DB’ cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details.
Msg 5069, Level 16, State 1, Line 1
ALTER DATABASE statement failed.
Msg 5172, Level 16, State 15, Line 1
The header for file ‘E:\SQLMDF\FINANCE_DB\FINANCE_DB.mdf’ is not a valid database file header. The PageAudit property is incorrect.
Msg 922, Level 14, State 1, Line 1
Database ‘FINANCE_DB’ is being recovered. Waiting until recovery is finished.

Initially, I was not aware of the ransomware attack on this server. So, I used my usual method to read the MDF file header without attaching the file.

SQL SERVER – FIX – Error: One or more files do not match the primary file of the database

Here are the commands.

DBCC CHECKPRIMARYFILE(N'E:\SQLMDF\FINANCE_DB\FINANCE_DB.mdf',0); --IsMDF
DBCC CHECKPRIMARYFILE(N'E:\SQLMDF\FINANCE_DB\FINANCE_DB.mdf',2)
DBCC CHECKPRIMARYFILE(N'E:\SQLMDF\FINANCE_DB\FINANCE_DB.mdf',3)

As we can see from the output, the first command reported that it is not a valid file (output = 0), and the other two commands failed with the error below:

[Screenshot: output of the DBCC CHECKPRIMARYFILE commands]

Msg 5171, Level 16, State 1, Line 2
E:\SQLMDF\FINANCE_DB\FINANCE_DB.mdf is not a primary database file.

WORKAROUND/SOLUTION

There is nothing technical which a SQLDBA can do in this situation. I was able to find one good link which might help someone.

To find out which ransomware attacked you, please visit this site. Once ransomware name is identified, check if it is possible to decrypt files using known keys.

All the best in trying the various options to recover your data!

Reference: Pinal Dave (https://blog.SQLAuthority.com)


SQL SERVER – Rebuild Index Job Failed – Error: 9002 – The Transaction Log for Database ‘PinalDB’ is Full Due to ‘LOG_BACKUP’


Sometimes the errors in the SQL Server job history are not precise enough to find the cause of an issue. In this blog, we will explore a situation where a rebuild index job was failing with error 9002: The transaction log for database ‘PinalDB’ is full due to ‘LOG_BACKUP’.

THE PROBLEM

In this situation, my client wanted to know the cause of rebuild index job failure. The complete error message in the job history was as follows:

Executing the query “ALTER INDEX [PK_auditID] ON [dbo].[tbl_audi…” failed with the following error: “The transaction log for database ‘PinalDB’ is full due to ‘LOG_BACKUP’.

The statement has been terminated.”. Possible failure reasons: Problems with the query, “ResultSet” property not set correctly, parameters not set correctly, or connection not established correctly.

THE ROOT CAUSE ANALYSIS APPROACH

If we look at the message, it is clear that the LDF file was full, and that caused the rebuild index job to fail. My client informed me that they were taking regular log backups, so why was the LDF full due to the LOG_BACKUP reason? As my usual way to find the cause, I always ask to see the SQL Server ERRORLOG and look for interesting messages around the time the issue was reported. If you are not familiar with the SQL Server ERRORLOG, then you must read my earlier blog on the same topic.

SQL SERVER – Where is ERRORLOG? Various Ways to Find ERRORLOG Location

In ERRORLOG, we could see below messages.

2018-09-06 05:00:07.36 Backup Log was backed up. Database: PINALDB, creation date(time): 2017/06/07(11:11:11), first LSN: 3384654:6547:1, last LSN: 3384654:6550:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {‘J:\Backup\PINALDB\PINALDB_backup_2018_09_06_050005_5978479.trn’}). This is an informational message only. No user action is required.
2018-09-06 05:51:10.57 spid160 Error: 9002, Severity: 17, State: 2.
2018-09-06 05:51:10.57 spid160 The transaction log for database ‘PINALDB’ is full due to ‘LOG_BACKUP’.
2018-09-06 06:00:01.76 spid16s Error: 9002, Severity: 17, State: 2.
2018-09-06 06:00:01.76 spid16s The transaction log for database ‘PINALDB’ is full due to ‘LOG_BACKUP’.
2018-09-06 06:00:01.76 spid16s Could not write a checkpoint record in database PINALDB because the log is out of space. Contact the database administrator to truncate the log or allocate more space to the database log files.
2018-09-06 06:00:01.77 spid16s Error: 5901, Severity: 16, State: 1.
2018-09-06 06:00:01.77 spid16s One or more recovery units belonging to database ‘PINALDB’ failed to generate a checkpoint. This is typically caused by lack of system resources such as disk or memory, or in some cases due to database corruption. Examine previous entries in the error log for more detailed information on this failure.

The last two error messages give more information and confirm that we were running out of log space. Further, I looked into the Event Log and found a breakthrough message there.

09/06/2018 05:51:39 AM   Warning       2013    srv      The M: disk is at or near capacity.  You may need to delete some files.

The time (05:51) matches the first 9002 error in the SQL ERRORLOG to within a minute. I checked their database configuration using sys.database_files and found that the M: drive holds only the LDF file for this database.
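A check along these lines can be sketched as follows. It uses sys.master_files together with sys.dm_os_volume_stats instead of sys.database_files, so that the free space on each volume shows up next to the file; the database name is the one from the error message.

```sql
-- Why can the log not be truncated right now?
SELECT name,
       recovery_model_desc,
       log_reuse_wait_desc
FROM sys.databases
WHERE name = N'PinalDB';

-- Where do the files live, and how much free space does each volume have?
SELECT mf.name                         AS logical_name,
       mf.type_desc,
       mf.physical_name,
       mf.size / 128.0                 AS size_mb,
       vs.volume_mount_point,
       vs.available_bytes / 1048576.0  AS volume_free_mb
FROM sys.master_files AS mf
CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) AS vs
WHERE mf.database_id = DB_ID(N'PinalDB');
```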

Using the above information, I was able to provide the root cause of the issue: the M: drive hosting the LDF file was at capacity, so the transaction log could not grow. Have you been in a situation where you had to look at multiple logs to find the root cause?

Reference: Pinal Dave (https://blog.SQLAuthority.com)


SQL SERVER – Scripts to Overview HADR / AlwaysOn Local Replica Server


Today, I received a very interesting script from SQL Server expert Dominic Wirth. He has written a very helpful script which displays a utilization overview of the local availability group replica server. The overview contains the number of databases as well as the total size of the databases (DATA, LOG, FILESTREAM) and is grouped by the following two categories:

  1. Replica role (PRIMARY / SECONDARY)
  2. Availability Group

Let us first see the script:

/*==================================================================
Script: HADR Local Replica Overview.sql
Description: This script will display a utilisation overview
of the local Availability Group Replica Server.
The overview will contain amount of databases as
well as total size of databases (DATA, LOG, FILESTREAM)
and is group by ...
1) ... Replica role (PRIMARY / SECONDARY)
2) ... Availability Group
Date created: 05.09.2018 (Dominic Wirth)
Last change: -
Script Version: 1.0
SQL Version: SQL Server 2014 or higher
====================================================================*/
-- Load size of databases which are part of an Availability Group
DECLARE @dbSizes TABLE (DatabaseId INT, DbTotalSizeMB INT, DbTotalSizeGB DECIMAL(10,2));
DECLARE @dbId INT, @stmt NVARCHAR(MAX);
SELECT @dbId = MIN(database_id) FROM sys.databases WHERE group_database_id IS NOT NULL;
WHILE @dbId IS NOT NULL
BEGIN
SELECT @stmt = 'USE ' + QUOTENAME(DB_NAME(@dbId)) + '; SELECT ' + CAST(@dbId AS NVARCHAR(10)) + ', (SUM([size]) / 128.0), (SUM([size]) / 128.0 / 1024.0) FROM sys.database_files;';
INSERT INTO @dbSizes (DatabaseId, DbTotalSizeMB, DbTotalSizeGB) EXEC (@stmt);
SELECT @dbId = MIN(database_id) FROM sys.databases WHERE group_database_id IS NOT NULL AND database_id > @dbId;
END;
-- Note: no GO between the statements. The table variable @dbSizes is scoped to the
-- batch, so the queries below must run in the same batch as its DECLARE.
-- Show utilisation overview grouped by replica role
SELECT AR.replica_server_name, DRS.is_primary_replica AS IsPrimaryReplica, COUNT(DB.database_id) AS [Databases]
,SUM(DBS.DbTotalSizeMB) AS SizeOfAllDatabasesMB, SUM(DBS.DbTotalSizeGB) AS SizeOfAllDatabasesGB
FROM sys.dm_hadr_database_replica_states AS DRS
INNER JOIN sys.availability_replicas AS AR ON DRS.replica_id = AR.replica_id
LEFT JOIN sys.databases AS DB ON DRS.group_database_id = DB.group_database_id
LEFT JOIN @dbSizes AS DBS ON DB.database_id = DBS.DatabaseId
WHERE DRS.is_local = 1
GROUP BY DRS.is_primary_replica, AR.replica_server_name
ORDER BY AR.replica_server_name ASC, DRS.is_primary_replica DESC;
-- Show utilisation overview grouped by Availability Group
SELECT AR.replica_server_name, DRS.is_primary_replica AS IsPrimaryReplica, AG.[name] AS AvailabilityGroup, COUNT(DB.database_id) AS [Databases]
,SUM(DBS.DbTotalSizeMB) AS SizeOfAllDatabasesMB, SUM(DBS.DbTotalSizeGB) AS SizeOfAllDatabasesGB
FROM sys.dm_hadr_database_replica_states AS DRS
INNER JOIN sys.availability_groups AS AG ON DRS.group_id = AG.group_id
INNER JOIN sys.availability_replicas AS AR ON DRS.replica_id = AR.replica_id
LEFT JOIN sys.databases AS DB ON DRS.group_database_id = DB.group_database_id
LEFT JOIN @dbSizes AS DBS ON DB.database_id = DBS.DatabaseId
WHERE DRS.is_local = 1
GROUP BY AG.[name], DRS.is_primary_replica, AR.replica_server_name
ORDER BY AG.[name] ASC, AR.replica_server_name ASC;
GO

Here is the screenshot of the result set which you will get if you run the above script.

[Screenshot: result set of the HADR local replica overview script]

There are many different kinds of reports which you can run via SQL Server Management Studio. However, sometimes a simple script like this one is very helpful and returns quick results. You can further modify the above script to get additional details for your server as well.
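As one possible modification (my own sketch, not part of Dominic's script), the sizes can be read from sys.master_files so that the whole overview runs as a single statement without switching database context. The sizes then come from the metadata in master rather than from each database's own sys.database_files.

```sql
-- Overview per Availability Group for the local replica, sized from sys.master_files.
SELECT AG.[name]                                        AS AvailabilityGroup,
       DRS.is_primary_replica                           AS IsPrimaryReplica,
       COUNT(DISTINCT DB.database_id)                   AS [Databases],
       CAST(SUM(CAST(MF.[size] AS BIGINT)) / 128.0
            AS DECIMAL(18,2))                           AS SizeOfAllDatabasesMB
FROM sys.dm_hadr_database_replica_states AS DRS
INNER JOIN sys.availability_groups AS AG ON DRS.group_id = AG.group_id
INNER JOIN sys.databases AS DB ON DRS.group_database_id = DB.group_database_id
INNER JOIN sys.master_files AS MF ON MF.database_id = DB.database_id
WHERE DRS.is_local = 1
GROUP BY AG.[name], DRS.is_primary_replica
ORDER BY AG.[name] ASC;
```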

Here are a few additional scripts which also discuss various concepts related to AlwaysOn.

If you have any other interesting script, please let me know, and I will be happy to publish it on the blog with due credit to you.

Reference: Pinal Dave (https://blog.SQLAuthority.com)



SQL Authority News – Microsoft Released SQL Server 2019 Preview


The wait is over. Microsoft has just announced the SQL Server 2019 Preview at the Ignite conference. Lots of people were wondering if there was going to be a SQL Server 2018. The answer is that Microsoft is going to skip this year as a release year for SQL Server and will release SQL Server 2019 early next year.


Here are some of the important links which you may want to keep bookmarked.

  • What’s New in SQL Server 2019 (Link)
  • SQL Server 2019 preview release notes (Link)
  • SQL Server Management Studio (SSMS) 18.0 (Link)
  • Azure Data Studio (Link)

If you are an early adopter, I strongly suggest that you try out SQL Server 2019. I am going to install this on my VM very soon.

If you are a SQL Server Performance Tuning expert, you may not be too excited. I looked at the various documents, and here are the updates in the SQL Server Engine.

  • Intelligent Query Processing
    • Row Mode Memory Grant Feedback
    • Approximate Count Distinct
    • Batch Mode on Rowstore
    • Table Variable Deferred Compilation
  • Resumable Online Index
  • Clustered Columnstore Online operation

Honestly, that’s it. I really wish Microsoft had come up with more enhancements in the area of SQL Server Performance Tuning. I am not particularly excited about the performance tuning area, but I am delighted with whatever else the version has to offer to users.

To use the new query processing features of SQL Server 2019, you have to change the database compatibility level to 150:

ALTER DATABASE DBName SET COMPATIBILITY_LEVEL = 150;
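Before and after running the statement above, the current level of each database can be checked with a simple query. This is just a quick sketch against sys.databases.

```sql
-- Show the compatibility level of every database on the instance.
-- A value of 150 corresponds to SQL Server 2019.
SELECT name,
       compatibility_level
FROM sys.databases
ORDER BY name;
```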

I will share more information as I learn about the latest version of SQL Server.

Reference: Pinal Dave (http://blog.SQLAuthority.com)


SQL SERVER 2019 – Oracle JRE 7 Update 51 (64 Bit) or Higher is Required


This is my second blog about SQL Server 2019. My previous blog was about the release announcement of the SQL Server 2019 CTP version: SQL Authority News – Microsoft Released SQL Server 2019 Preview. In this blog post, we will discuss the setup rule “Oracle JRE 7 Update 51 (64-bit) or higher is required”.

As is my habit, I always want to try new things and share my knowledge with my blog readers. I downloaded the ISO file, mounted it, and ran setup.exe. I kept moving through the wizard and selected all features. I was stuck at the screen below.

[Screenshot: setup wizard showing the failed rule]

(19) 2018-09-26 05:53:07 Slp: Initializing rule: Oracle JRE 7 Update 51 (64-bit) or higher is required for Polybase Java
(19) 2018-09-26 05:53:07 Slp: Rule is will be executed: True
(19) 2018-09-26 05:53:07 Slp: Init rule target object: Microsoft.SqlServer.Configuration.PolybaseJava.PolybaseJava_IsMinJavaVersionInstalledFacet
(19) 2018-09-26 05:53:07 SQLPolyBase: Could not find registry setting HKEY_LOCAL_MACHINE\SOFTWARE\JavaSoft\Java Runtime Environment\CurrentVersion.
(19) 2018-09-26 05:53:07 SQLPolyBase: Minimum version expected: 7.51. Java not found.
(19) 2018-09-26 05:53:07 Slp: Evaluating rule : PolybaseJava_IsMinJavaVersionInstalled
(19) 2018-09-26 05:53:07 Slp: Rule running on machine: SQL2019VM
(19) 2018-09-26 05:53:07 Slp: Rule evaluation done : Failed
(19) 2018-09-26 05:53:07 Slp: Rule evaluation message:
(19) 2018-09-26 05:53:07 Slp: Send result to channel : RulesEngineNotificationChannel

When we click on “View Detailed Report”, we see the below in an HTML page.

[Screenshot: detailed report for the failed rule]

I clearly remember this error because I blogged about it earlier for SQL Server 2016.

SQL SERVER – 2016 FIX: Install – Rule “Oracle JRE 7 Update 51 (64-bit) or higher is required” failed

When I tried following my own blog, I realized that I could no longer find JRE 7 at the link given in my earlier blog.

WORKAROUND/SOLUTION

When I looked at the Oracle site, I was not able to find JRE 7.  Here are the steps which worked as of today.

  1. Go to http://www.oracle.com/technetwork/java/javase/downloads/index.html
  2. Scroll down a little and look for “Java SE 10.0.2”.
  3. On the right side, you will see a few “Download” buttons. Click on the one below “JRE”.
    [Screenshot: Download button below JRE]
  4. Once the next page opens, accept the agreement and then download the highlighted file.
    [Screenshot: highlighted JRE download file]
  5. Install the executable.

After this, I was able to press the “Re-Run” button and proceed with the setup.

[Screenshot: setup proceeding after Re-Run]

I hope this new version of the article helps you in installing SQL Server 2019. Please comment and let me know.

Reference: Pinal Dave (https://blog.sqlauthority.com)


SQL SERVER – Unable to Launch SQL Server Configuration Manager. Error: Cannot Connect to WMI Provider. [0x80070422]


While playing with my virtual machine, I encountered an interesting error. In this blog, we will learn how to fix the error “Cannot connect to WMI provider [0x80070422]” while launching SQL Server Configuration Manager.

As I have written in my earlier blogs, I always use my virtual machine as my playground to break things and fix them. If I can fix it, I write a blog, and if I can’t, I restore the VM to the last known good image. After making some changes, when I tried launching SQL Server Configuration Manager, I was welcomed by the error message below.

[Screenshot: error message dialog]

Here is the text of the error message.

Cannot connect to WMI provider. You do not have permission or the server is unreachable. Note that you can only manage SQL Server 2005 and later servers with SQL Server Configuration Manager.
The service cannot be started, either because it is disabled or because it has no enabled devices associated with it. [0x80070422]

The first part of the error message is quite generic. The second part is the one which explains the real cause: it talks about a service which is needed. Based on my experience, I knew that it is all about the WMI service. If you get the same hex code 0x80070422, then you should check the services on that machine. You might find that the Windows Management Instrumentation (WMI) service is set to “Disabled”.

[Screenshot: Windows Management Instrumentation service set to Disabled]

WORKAROUND/SOLUTION

As you might have guessed, we need to enable the Windows Management Instrumentation (WMI) service to get it working. This service should be healthy for SQL Server Configuration Manager to work properly. Here are the steps to enable the service.

  1. Click on Start and go to Run
  2. Type “Services.msc” and hit enter.
  3. Scroll down to locate “Windows Management Instrumentation”
  4. Right-click on that service and go to properties.
  5. From the “Startup type” drop-down, choose “Automatic”.
  6. Hit OK to save the settings.

After following the above, you should be able to launch the SQL Server Configuration Manager without any error.  Have you found any other cause of such error caused by WMI?

Reference: Pinal Dave (https://blog.SQLAuthority.com)


SQL SERVER – SQL Server Configuration Manager Error: 0x80010108 – The Object Invoked Has Disconnected From its Clients


While writing my previous blog about SQL Server Configuration Manager, I encountered another error. In this blog, we will learn one of the possible causes of error 0x80010108: The object invoked has disconnected from its clients.

Here is the screenshot of the error message.

SQL SERVER - SQL Server Configuration Manager Error: 0x80010108 - The Object Invoked Has Disconnected From its Clients sscm-hex-10108-01

WORKAROUND/SOLUTION

When I searched for the hex code on the internet, I found that it maps to RPC_E_DISCONNECTED, which is simply the symbolic name for the text of the message. When I looked back at the series of actions I had performed, I found that the error was reproducible.

  1. Open SQL Server Configuration Manager (SSCM).
  2. Click on SQL Server Services to view the list of services.
  3. Stop the Windows Management Instrumentation service using the Services applet (Start > Run > Services.msc).
  4. Switch back to SSCM and refresh the services.
  5. You should see the error “The object invoked has disconnected from its clients. [0x80010108]”

Once you close and reopen SSCM, it should automatically start the WMI service, and you should be able to see the list of services again.

We simulated the error by stopping the WMI service manually. So, whenever you see this error in SQL Server Configuration Manager, you should figure out why the WMI service was stopped. You can start with the Event Viewer to see if there are any other interesting events.

Reference: Pinal Dave (https://blog.SQLAuthority.com)

First appeared on SQL SERVER – SQL Server Configuration Manager Error: 0x80010108 – The Object Invoked Has Disconnected From its Clients

SQL SERVER – Error: 45207 – SQL Server Managed Backup to Microsoft Azure Cannot Configure the Database Because a Container URL was Either not Provided or Invalid

While using SQL Server on Virtual Machines in Azure, I ran into an interesting error. In this blog, we will learn how to fix Msg 45207 – SQL Server Managed Backup to Microsoft Azure cannot configure the database ‘sqlauthdb’ because a container URL was either not provided or invalid. It is also possible that your SAS credential is invalid.

When I deployed the SQL Server Azure virtual machine, I enabled a feature called “Automatic Backup”. Due to this setting, SQL Server was taking regular backups to blob storage. Since I am not running a production server, I decided to minimize the cost by deleting unwanted resources, so I deleted the storage account. I then noticed that SQL Server started logging backup failures in the SQL ERRORLOG, so I decided to disable the feature.

When I tried this from the Azure portal, the disable operation failed with the error below:

Error type

At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-debug for usage details. (Code: DeploymentFailed)

Error details

The resource operation completed with terminal provisioning state ‘Failed’. (Code: ResourceDeploymentFailure)

VM has reported a failure when processing extension ‘SqlIaasExtension’. Error message: “SQL Server IaaS Agent: SQL Server Managed Backup to Microsoft Azure cannot configure the database ‘sqlauthdb’ because a container URL was either not provided or invalid. It is also possible that your SAS credential is invalid. ;The creator of this fault did not specify a Reason.;Automated Patching: Automated Patching enabled: False, Windows Update state: ScheduledInstallation, VM is up to date in applying important updates.;Automatic Telemetry: Performance Collector State: DisabledOptedOut”. (Code: VMExtensionProvisioningError)

Here is the screenshot for the same error.

SQL SERVER - Error: 45207 - SQL Server Managed Backup to Microsoft Azure Cannot Configure the Database Because a Container URL was Either not Provided or Invalid disable-auto-bkp-01

I then decided to disable it from within SQL Server using the managed backup stored procedure. I executed the code below:

EXEC managed_backup.sp_backup_config_basic  
	@enable_backup=0 
	,@database_name = 'sqlauthdb'

Well, it failed with exactly the same error that we got from the portal.

Msg 45207, Level 17, State 11, Procedure managed_backup.sp_add_task_command, Line 102 [Batch Start Line 8]

SQL Server Managed Backup to Microsoft Azure cannot configure the database ‘sqlauthdb’ because a container URL was either not provided or invalid. It is also possible that your SAS credential is invalid.

There is nothing wrong with the error message. I had deleted the storage account, so the URL is definitely invalid. What should I do now?

I had two choices:

  1. Create the same storage account again and put a new SAS token in the SQL credential, so that SQL Server can connect to the storage and disable the feature.
  2. Find a way to clean up all settings related to managed backup in SQL Server.

I am a lazy guy, so I wanted to get things done with choice #2.

WORKAROUND/SOLUTION

While looking at the msdb stored procedures, I came across an interesting procedure (thanks to the IntelliSense feature of SSMS): autoadmin_metadata_delete

SQL SERVER - Error: 45207 - SQL Server Managed Backup to Microsoft Azure Cannot Configure the Database Because a Container URL was Either not Provided or Invalid disable-auto-bkp-02

When I looked at the code of the stored procedure, it had the comment below:

-- Procedure to delete entries in metadata tables

Perfect! This is what I was looking for. Here is the code which I ran, and it magically cleaned up everything.
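
The code listing itself did not survive in this copy of the post. A minimal sketch of what such a call could look like is below. It assumes the procedure is invoked from msdb without any parameters, which is an assumption and not something the text above confirms. Because the procedure is undocumented, it is worth reading its definition first.

```sql
USE msdb;
GO
-- Read the definition of the undocumented procedure before running it.
EXEC sp_helptext N'autoadmin_metadata_delete';
GO
-- Assumption: the procedure takes no parameters.
-- It removes managed backup metadata for ALL databases on the instance.
EXEC autoadmin_metadata_delete;
GO
-- Check what managed backup now reports for the database from this post.
SELECT *
FROM managed_backup.fn_backup_db_config(N'sqlauthdb');
GO
```

The last query uses the documented function managed_backup.fn_backup_db_config, so it is a safe way to confirm the result without touching anything.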

NOTE: I must mention that you should use this with caution in production, as it deletes everything about managed backup for all databases. Also, it is not documented on MSDN, so Microsoft might remove it from the product later.

Reference: Pinal Dave (https://blog.SQLAuthority.com)

First appeared on SQL SERVER – Error: 45207 – SQL Server Managed Backup to Microsoft Azure Cannot Configure the Database Because a Container URL was Either not Provided or Invalid

SQL SERVER – SSIS – Package Error: “ODBC Source” Failed Validation and Returned Validation Status “VS_NEEDSNEWMETADATA”

As a part of my consulting, I see many clients who use SSIS. In this blog, we will learn how to fix the error “ODBC Source” failed validation and returned validation status “VS_NEEDSNEWMETADATA”.

My client was using SSIS to transfer data from MySQL to SQL Server for reporting purposes. One fine day, they found that the SSIS package was failing with an error.

SQL SERVER - SSIS - Package Error: "ODBC Source" Failed Validation and Returned Validation Status "VS_NEEDSNEWMETADATA" ssis-err-001

“ConnectionToSQL” failed validation and returned validation status “VS_NEEDSNEWMETADATA”.

Since I have helped them in the past, they contacted me for help with SSIS. Even though it is not my area of expertise, I agreed to engage with them and learn something new. I searched for the error on the internet and found that the status means “Needs New Metadata”. But what metadata? Well, it is metadata about the schema of the source or destination table. If the schema obtained from the database has changed since the package was designed, the component fails validation with the VS_NEEDSNEWMETADATA status. Generally, it happens when we alter the table by adding a new column or removing an existing column.

WORKAROUND/SOLUTION

As explained earlier, this is mostly caused by a schema change in the source or destination. SSIS is schema-bound, and any change in the table schema requires the metadata in the SSIS package to be refreshed as well. The solution which worked for my client was to right-click on the data source and select Edit. It automatically prompted us with a question asking if it should fix the metadata. Choosing “Yes” solved the problem.

Based on my search on the internet, I found a few more solutions. I am sharing them here so that there is a single place to look when fixing this error:

  1. Watch out for the case sensitivity of the column names referred to in the package versus the tables. Based on my internet search, at least one person had the same error because the case did not match.
  2. It can also happen when mappings are incorrect or there are ambiguous columns in the file, for example when a column name is defined twice.
  3. If none of the above helps, then we need to get detailed output from the package execution. To do that, you can run the package from the command line using the dtexec utility.
DTEXEC /FILE PackageName.dtsx

This should give you more information and should point to the column which is causing the problem.

Reference: Pinal Dave (https://blog.SQLAuthority.com)

First appeared on SQL SERVER – SSIS – Package Error: “ODBC Source” Failed Validation and Returned Validation Status “VS_NEEDSNEWMETADATA”

SQL SERVER – Event ID: 10028 – SQL Server Distributed Replay Client – DCOM was Unable to Communicate with the Computer

Sometimes there are many errors in the event logs which we ignore, and they fill up the complete event log. When we have an issue and data is needed for root cause analysis, we do not see the actual data because it has been overwritten by meaningless messages. In this blog, we will learn how to fix Event ID: 10028 – SQL Server Distributed Replay Client – DCOM was unable to communicate with the computer.

My client contacted me when he was supposed to provide a root cause analysis for an issue, but the event log was showing many messages like the one below.

Log Name: System
Source: Microsoft-Windows-DistributedCOM
Event ID: 10028
Task Category: None
Level: Error
Keywords: Classic
User: NT SERVICE\SQL Server Distributed Replay Client
Description:
DCOM was unable to communicate with the computer DREPLAY using any of the configured protocols; requested by PID 20f4 (C:\Program Files (x86)\Microsoft SQL Server\140\Tools\DReplayClient\DReplayClient.exe).

As we can see below, the message comes every 5 seconds.

SQL SERVER - Event ID: 10028 - SQL Server Distributed Replay Client – DCOM was Unable to Communicate with the Computer dcom-dreplay-err-01

When I looked at the message, my first question was “Are you using the Distributed Replay feature of SQL Server?” Interestingly, he was not even aware of the feature, and the answer was negative. His goal was to get rid of those messages, which were being logged to the System event log every 5 seconds and filling it up.

WORKAROUND/SOLUTION

If you are not using the feature and want to avoid these messages, then stop the services below.

SQL SERVER - Event ID: 10028 - SQL Server Distributed Replay Client – DCOM was Unable to Communicate with the Computer dcom-dreplay-err-02

  1. SQL Server Distributed Replay Client.
  2. SQL Server Distributed Replay Controller.

You may also want to change the service startup type to Manual to avoid the same issue popping up after the next reboot.

As soon as I stopped the services, the messages stopped appearing in the event log.

Reference: Pinal Dave (https://blog.SQLAuthority.com)

First appeared on SQL SERVER – Event ID: 10028 – SQL Server Distributed Replay Client – DCOM was Unable to Communicate with the Computer


SQL SERVER – Event ID: 2004 – Resource Exhaustion Diagnosis Events

Recently I was working with a client who was having an issue with SQL Server memory usage. They received the warning message below in the System event log – Resource Exhaustion Diagnosis Events.

Log Name: System
Source: Microsoft-Windows-Resource-Exhaustion-Detector
Event ID: 2004
Task Category: Resource Exhaustion Diagnosis Events
Level: Warning
Keywords: Events related to exhaustion of system commit limit (virtual memory).
User: NT AUTHORITY\SYSTEM
Description: Windows successfully diagnosed a low virtual memory condition. The following programs consumed the most virtual memory: sqlservr.exe (2303) consumed 129733837434 bytes, ReportingServicesService.exe (17805) consumed 6491374460 bytes, and msmdsrv.exe (1643) consumed 2905451451 bytes.

I had never seen such a message before, so I did some more research and found some interesting information. I am sharing it here for the benefit of my blog readers.

As you can tell from the source of the message, it is a part of operating system resource monitoring. The important part of this message is that it shows the top consumers of virtual memory. In the above sample from my client, they are the executables of SQL Server, Reporting Services and Analysis Services.

It was clear that the SQL Server process was using a lot of virtual memory. I started analyzing the DBCC MEMORYSTATUS output, but none of the memory clerks was consuming much memory. Would this mean that something else within the SQL Server process was doing it? I checked the loaded modules DMV in SQL Server and found something interesting.

SELECT description, *
FROM sys.dm_os_loaded_modules
WHERE description like 'McAfee%'

SQL SERVER - Event ID: 2004 - Resource Exhaustion Diagnosis Events low-mem-01
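
When the vendor is not known in advance, the same DMV can be searched more broadly. The query below is a sketch of that idea, listing every module loaded into the SQL Server process whose company is not Microsoft. The filter on the company column is my own choice for illustration and is not part of the original investigation.

```sql
-- List third-party modules loaded into the SQL Server process.
-- A NULL company is kept in the result because it is worth a closer look.
SELECT name,
       company,
       description,
       file_version
FROM sys.dm_os_loaded_modules
WHERE company IS NULL
   OR company NOT LIKE N'Microsoft%'
ORDER BY company, name;
```

Anti-virus, intrusion prevention and backup agents are the usual entries in such a list, and each one is a candidate when memory clerks do not explain the usage.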

WORKAROUND/SOLUTION

After a lot of searching on the internet, I found that McAfee has a known issue which causes SQL Server to consume more memory. The article is titled “Memory leak in sqlserver.exe process with the Host Intrusion Prevention 8.0 SQL engine”.

I checked the client’s version, and they were on a lower version of Host IPS. Once they applied the latest version of Host IPS, the issue was resolved, and we did not see those messages again.

Reference: Pinal Dave (https://blog.SQLAuthority.com)

First appeared on SQL SERVER – Event ID: 2004 – Resource Exhaustion Diagnosis Events

SQL SERVER – SQL Server Management Studio Crash While Using Backup to URL or Connecting to Storage

Recently, as a part of my on-demand consulting, I was helping a client who was in a disaster situation and urgently needed to restore a database backup which was stored in Azure Blob Storage. In this blog, we will learn how to fix a crash of SQL Server Management Studio while using backup to URL or connecting to storage.

As explained earlier, the client was trying to restore a database backup which was stored in Azure Blob Storage. They were going to the “Restore Database” menu option in SSMS and choosing URL as the device. As soon as they clicked on the Add button, SQL Server Management Studio crashed. Here is a screenshot.

SQL SERVER - SQL Server Management Studio Crash While Using Backup to URL or Connecting to Storage ssms-crash-url-01

Here are the details which you could see by clicking on the “View problem details” button.

Problem signature:
Problem Event Name: CLR20r3
Problem Signature 01: Ssms.exe
Problem Signature 02: 2014.120.5571.0
Problem Signature 03: 5a56a398
Problem Signature 04: Microsoft.SqlServer.RegSvrEnum
Problem Signature 05: 12.0.5000.0
Problem Signature 06: 5764ad48
Problem Signature 07: 7
Problem Signature 08: da
Problem Signature 09: System.Exception
OS Version: 6.3.9600.2.0.0.400.8
Locale ID: 1033
Additional Information 1: e01e
Additional Information 2: e01e71249cad1577f3cd863e8d1ab175
Additional Information 3: 1edb
Additional Information 4: 1edbb4ca04145d7b8df23b25a086703c

The above information did not help in finding the cause, but I have shared it here so that someone searching for it can reach this blog.

Then I tested the same steps on my own SQL Server Management Studio, and it asked me to connect to a storage account from which the backup could be picked for the restore. So, I asked my client to connect to the storage account directly from SQL Server Management Studio using the option below.

SQL SERVER - SQL Server Management Studio Crash While Using Backup to URL or Connecting to Storage ssms-crash-url-02

Strangely, we saw the same crash of SQL Server Management Studio there as well. This test suggested that there was some issue with the information stored by SQL Server Management Studio about the storage account. My client had been using the same SSMS and, in the past, he was able to connect to Azure storage.

WORKAROUND/SOLUTION

I asked my client to try connecting to the Azure storage from a different machine. Interestingly, it worked, and it confirmed our theory.

Now we needed to figure out how to clean up the information about the storage account stored in the cache of SSMS. I was not able to find a way to clean up only the Azure storage-related information from the user profile. So, I ended up removing all saved settings of SSMS by deleting the sqlstudio.bin file from the %AppData% user profile. The file “sqlstudio.bin” is located under the location below. (Go to Start > Run and paste the path below.)

%AppData%\Microsoft\SQL Server Management Studio

Once the folder is opened on your machine, you might see folders like the ones below.

SQL SERVER - SQL Server Management Studio Crash While Using Backup to URL or Connecting to Storage ssms-crash-url-03

Based on the SSMS version, go inside the matching folder and delete the “sqlstudio.bin” file. (You can also rename the file, and SSMS will create a new one.)

Number   SQL Server Management Studio (SSMS) Version
11.0     SQL Server 2012
12.0     SQL Server 2014
13.0     SQL Server 2016
14.0     SQL Server 2017
18.0     SSMS 18.0 (separated from SQL Server)

Please keep in mind that this is not the smartest solution available, as it deletes all saved information in SSMS (saved usernames and passwords, the server name list, any settings which you have changed, etc.).

Reference: Pinal Dave (https://blog.sqlauthority.com)

First appeared on SQL SERVER – SQL Server Management Studio Crash While Using Backup to URL or Connecting to Storage

Azure Virtual Machine – You Must Change Your Password Before Logging On The First Time

One fine day when I tried connecting to my Azure Virtual Machine in the cloud, I was welcomed with this error message.

Azure Virtual Machine - You Must Change Your Password Before Logging On The First Time azure-rdp-01

Tip: If you press Ctrl + C on the message window, it copies the text to the clipboard, and when you paste it, it comes out like below. Neat and clean!

[Window Title]

Remote Desktop Connection

[Content]

You must change your password before logging on the first time. Please update your password or contact your system administrator or technical support.

[OK] [Help]

I clearly remember that I had not done anything with the user account. Also, the machine was NOT joined to any domain.

WORKAROUND/SOLUTION

I knew that I needed to change the password, but how could I get to the change-password prompt on an Azure Virtual Machine?

The easiest way to reset the password is to use the Azure Portal. Open the Azure Portal, go to “Virtual Machines”, select the virtual machine, and click “Reset password” under “Support + Troubleshooting”.

Azure Virtual Machine - You Must Change Your Password Before Logging On The First Time azure-rdp-02

As the message says, we can also provide a new account and password. After providing the details, hit “Update” and wait for it to finish. It shows the screen below.

Azure Virtual Machine - You Must Change Your Password Before Logging On The First Time azure-rdp-03

Once done, you should be able to log in to the VM using the new account (if a new account was provided during the reset) or the existing account (for which you have given a new password).

I hope this helps someone who runs into this error on an Azure Virtual Machine.

Reference: Pinal Dave (https://blog.sqlauthority.com)

First appeared on Azure Virtual Machine – You Must Change Your Password Before Logging On The First Time

SQL SERVER – Timeout Occurred While Waiting for Latch: Class FGCB_ADD_REMOVE

SQL SERVER - Timeout Occurred While Waiting for Latch: Class FGCB_ADD_REMOVE warning

One of my clients contacted me to resolve an issue where they were seeing these errors a few minutes after restarting the SQL Server service. In this blog, we will learn how to fix the error Timeout occurred while waiting for latch: class FGCB_ADD_REMOVE.

As soon as I got on the call with them, my first question was where they were seeing the latch error, and they showed me the SQL Server ERRORLOG. Here are the messages from the SQL Server ERRORLOG file.

2018-09-25 10:32:00.74 spid423     Timeout occurred while waiting for latch: class ‘FGCB_ADD_REMOVE’, id 00000000807D59B8, type 2, Task 0x00000000C896FDC8 : 0, waittime 300, flags 0x1a, owning task 0x00000000820BE608. Continuing to wait.

2018-09-25 10:32:08.80 spid419     Timeout occurred while waiting for latch: class ‘FGCB_ADD_REMOVE’, id 00000000807D59B8, type 2, Task 0x00000000B2DB7948 : 0, waittime 300, flags 0x1a, owning task 0x00000000820BE608. Continuing to wait.

2018-09-25 10:32:09.81 spid444     Timeout occurred while waiting for latch: class ‘FGCB_ADD_REMOVE’, id 00000000807D59B8, type 2, Task 0x0000000082581288 : 0, waittime 300, flags 0x1a, owning task 0x00000000820BE608. Continuing to wait.

I checked the documentation for sys.dm_os_latch_stats (Transact-SQL) to see what this latch class means.

As per the documentation, latch class FGCB_ADD_REMOVE is related to filegroups and is used to synchronize file operations such as adding and dropping files. So, if SQL Server is performing an operation that changes the database files (add, remove, grow, shrink, rename) and that operation is blocked or slow, then we may see this timeout.
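
To see how much waiting has accumulated on this latch class since the instance started, the same DMV can be queried directly. This is a small sketch using only the documented columns of sys.dm_os_latch_stats.

```sql
-- Cumulative waits on the filegroup add/remove latch since the last restart
-- (or since the latch statistics were last cleared).
SELECT latch_class,
       waiting_requests_count,
       wait_time_ms,
       max_wait_time_ms
FROM sys.dm_os_latch_stats
WHERE latch_class = N'FGCB_ADD_REMOVE';
```

A max_wait_time_ms that keeps climbing while the ERRORLOG messages are appearing suggests that a file operation is holding the latch for a long time.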

I asked them to capture Extended Events (XEvents) to gather more data. After analysis, I was able to see that SQL Server was trying to grow a database data file and was stuck waiting for the file growth to complete.
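
The post does not say which events were captured. One plausible way to observe file growth is the database_file_size_change event, sketched below. The session name, the target file name and the choice of actions are assumptions made for illustration.

```sql
-- Hypothetical session name and file name; adjust to your environment.
CREATE EVENT SESSION [TrackFileGrowth] ON SERVER
ADD EVENT sqlserver.database_file_size_change
(
    -- Capture who triggered the growth and with which statement.
    ACTION (sqlserver.database_name, sqlserver.session_id, sqlserver.sql_text)
)
ADD TARGET package0.event_file (SET filename = N'TrackFileGrowth.xel')
WITH (STARTUP_STATE = OFF);
GO
-- Start collecting.
ALTER EVENT SESSION [TrackFileGrowth] ON SERVER STATE = START;
GO
```

Each captured event includes the duration and the size change, which makes slow growth operations easy to spot.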

WORKAROUND/SOLUTION

There are a few things to check in such a situation:

  1. Make sure that the auto-growth setting is not set to too high a value. I always prefer a fixed size in MB rather than a percentage, and I usually go with an auto-growth value of 512 or 1024 MB.
  2. Make sure the disk subsystem is healthy. Even if the growth size is small, slow disks can make the growth take longer to complete, and it might time out.
  3. Make sure that instant file initialization is enabled. With it, SQL Server can initialize data files instantaneously, which allows file operations such as growth to execute quickly. Instant file initialization reclaims used disk space without filling that space with zeros. I have covered this in my earlier articles.
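
Checks 1 and 3 above can be sketched in T-SQL as follows. The database name and logical file name in the ALTER DATABASE statement are placeholders, and the instant_file_initialization_enabled column is only available on newer builds (SQL Server 2016 SP1 and later, as far as I know), so treat that part as an assumption about your version.

```sql
-- 1. Review the current auto-growth setting of every file.
--    growth is stored in 8 KB pages unless is_percent_growth = 1.
SELECT DB_NAME(database_id) AS database_name,
       name                 AS logical_file_name,
       type_desc,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS varchar(20)) + ' %'
            ELSE CAST(CAST(growth AS bigint) * 8 / 1024 AS varchar(20)) + ' MB'
       END                  AS auto_growth
FROM sys.master_files
ORDER BY database_name, logical_file_name;

-- Switch a file to a fixed growth increment (placeholder names).
ALTER DATABASE [YourDatabase]
MODIFY FILE (NAME = N'YourDatabase_Data', FILEGROWTH = 512MB);

-- 3. Check whether instant file initialization is enabled for the engine.
SELECT servicename,
       service_account,
       instant_file_initialization_enabled
FROM sys.dm_server_services;
```

Note that instant file initialization applies to data files only; transaction log files are always zero-initialized, so a small fixed growth matters even more for them.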

On my client’s server, it was a combination of 1 and 3. We reduced the growth size and enabled instant file initialization. They never faced the above-mentioned error again after following my guidance.

Have you ever seen this kind of latch error in SQL Server?

Reference: Pinal Dave (https://blog.sqlauthority.com)

First appeared on SQL SERVER – Timeout Occurred While Waiting for Latch: Class FGCB_ADD_REMOVE

SQL SERVER – Always On AG – HADRAG: Did not Find the Instance to Connect in SqlInstToNodeMap Key

During my On Demand (50 Minutes) consultancy, I solve issues which my clients need fixed quickly. SQL Server not starting, Always On not failing over, and a cluster not working are a few of the things for which my clients engage me. In this blog, I will share a situation where an Always On Availability Group was not coming online due to the error – Did not find the instance to connect in SqlInstToNodeMap key.

THE SITUATION

There was some instability in a cluster which caused a few unexpected failovers of the Always On availability group between node1 and node2, sometimes back and forth. When they contacted me, we found that the clustered resource for the availability group was not coming online.

My first step, always, is to get the error that is being reported by SQL Server, the cluster, or Windows. The event log reported the error below:

Cluster resource ‘PRODAG’ of type ‘SQL Server Availability Group’ in clustered role ‘PRODAG’ failed.

SQL SERVER - Always On AG - HADRAG: Did not Find the Instance to Connect in SqlInstToNodeMap Key alwaysonerror

Based on the failed policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

The above error is very generic and does not tell us more than what we already know.

When I checked SQL Server Management Studio, we saw that the secondary replica was not connected to the primary replica. The connected state was “DISCONNECTED” in the DMV, and SSMS showed a red symbol for this replica. The next step was to generate a cluster log.

SQL SERVER – Steps to Generate Windows Cluster Log?

And BINGO! We were able to see some relevant messages there.

INFO  [RES] SQL Server Availability Group <PRODAG>: [hadrag] The DeadLockTimeout property has a value of 300000
INFO  [RES] SQL Server Availability Group <PRODAG>: [hadrag] The PendingTimeout property has a value of 180000
ERR   [RES] SQL Server Availability Group <PRODAG>: [hadrag] Did not find the instance to connect in SqlInstToNodeMap key.
ERR   [RHS] Online for resource PRODAG failed.

“ERR” is the tag I look for in the cluster log, and it is the one you should focus on. Just before the failure, we see this error: Did not find the instance to connect in SqlInstToNodeMap key. I searched and found that SqlInstToNodeMap is a registry key which should have the same information as sys.dm_hadr_instance_node_map.

When I checked the primary replica, we were not able to see the AG under the “Availability Groups” node in SSMS. Also, there were no replicas listed under the “Availability Replicas” node. When we tried querying sys.dm_hadr_database_replica_states, we did not get any results.
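
The checks described above can be run from a query window on each replica. The queries below are a sketch of that comparison using documented DMVs and catalog views; the choice of columns is mine.

```sql
-- Connection state of each replica as seen from this instance.
SELECT ar.replica_server_name,
       ars.role_desc,
       ars.connected_state_desc,
       ars.synchronization_health_desc
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = ars.replica_id;

-- Instance-to-node mapping that the SqlInstToNodeMap key should match.
SELECT ag_resource_id,
       instance_name,
       node_name
FROM sys.dm_hadr_instance_node_map;

-- Per-database state; no rows here matches the symptom described above.
SELECT database_id,
       synchronization_state_desc,
       synchronization_health_desc
FROM sys.dm_hadr_database_replica_states;
```

If these return rows on one replica and nothing on the other, the two sides disagree about the availability group, which is the kind of mismatch this post is about.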

WORKAROUND/SOLUTION

All of the above symptoms mean that there is a metadata mismatch between the information in the cluster and the information in SQL Server. Even the two replicas had mismatched information about the availability group. We ran the command below on the secondary to remove the information about the AG. We were not able to use the UI, as it was giving an error.

DROP AVAILABILITY GROUP PRODAG

As soon as we executed it, the databases went into the restoring state, and the AG information was cleared from all DMVs and from the cluster as well. Then we recreated the availability group using the AG wizard, and we were back in business in less than 20 minutes of the call with me.
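
Before recreating the availability group, the cleanup can be confirmed with two quick queries on the instance where the DROP was executed. This is a sketch of such a check and not something taken from the original call.

```sql
-- Expect no row for the dropped availability group.
SELECT name
FROM sys.availability_groups;

-- Databases that were part of the AG on the secondary should be RESTORING.
SELECT name,
       state_desc
FROM sys.databases
WHERE state_desc = N'RESTORING';
```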

I truly hope that this blog can help someone who is getting the same issue with AG.

Reference: Pinal Dave (https://blog.sqlauthority.com)

First appeared on SQL SERVER – Always On AG – HADRAG: Did not Find the Instance to Connect in SqlInstToNodeMap Key
